Temporal and Spatial Features for Visual Speech Recognition

Ali Jafari Sheshpoli; Ali Nadian-Ghomsheh

Published in: Fifth International Conference on Electrical and Computer Engineering with Emphasis on Indigenous Knowledge

Publication year: 1396 (Solar Hijri)

Document type: Conference paper

Language: English

This paper is 8 pages long and is available in PDF format.

Permanent link to this paper:

https://civilica.com/doc/725449

National scientific document ID:

COMCONF05_473

Indexing date: 21 Ordibehesht 1397

Abstract:

Speech recognition from visual data is an important step towards communication when audio is not available. This paper considers several hand-crafted features, including HOG, MBH, DCT, LBP, MTC, and their combinations, for recognizing speech from a sequence of images. Several classifiers, including SVM, decision trees, the K-nearest neighbor algorithm, and the sub-space K-nearest algorithm, were tested for feature evaluation. The application of PCA for dimensionality reduction was also considered. Two sets of tests were carried out: lip pose recognition and recognition of isolated words. The MIRACL-VC1 data set was used for evaluation. Self-dependent tests reached an accuracy of over 95%, while in the self-independent tests the maximum recognition accuracy was about 52%.
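
The pipeline outlined in the abstract (hand-crafted features, PCA, then a classifier) can be sketched roughly as follows. This is only an illustrative sketch and not the authors' implementation: it assumes NumPy is available, uses DCT as the single spatial descriptor, adds a frame-difference term as a hypothetical temporal descriptor, and runs on synthetic frames rather than MIRACL-VC1. All function names and parameters are invented for illustration. The last speaker is held out entirely, which is one possible reading of the abstract's "self-independent" test.

```python
import numpy as np

def dct2_features(frame, k=4):
    """Return the k x k low-frequency block of an orthonormal 2-D DCT-II."""
    def dct_matrix(n):
        i = np.arange(n)
        # Rows index frequency, columns index sample position.
        m = np.cos(np.pi * (2 * i[None, :] + 1) * i[:, None] / (2 * n))
        m[0] *= 1 / np.sqrt(2)
        return m * np.sqrt(2 / n)
    rows, cols = frame.shape
    coeffs = dct_matrix(rows) @ frame @ dct_matrix(cols).T
    return coeffs[:k, :k].ravel()

def sequence_descriptor(frames, k=4):
    """Assumed descriptor: mean DCT (spatial) + mean |frame difference| (temporal)."""
    per_frame = np.stack([dct2_features(f, k) for f in frames])
    spatial = per_frame.mean(axis=0)
    temporal = np.abs(np.diff(per_frame, axis=0)).mean(axis=0)
    return np.concatenate([spatial, temporal])

def pca_fit(X, n_components):
    """Fit PCA via SVD of the centered data; return mean and principal axes."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    return (X - mean) @ components.T

def knn_predict(train_X, train_y, test_X, k=3):
    """Plain k-nearest-neighbor majority vote with Euclidean distance."""
    dists = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return np.array([np.bincount(row).argmax() for row in train_y[nearest]])

# Synthetic stand-in data: 4 speakers x 3 words x 3 repetitions of 6 frames each.
rng = np.random.default_rng(0)
n_speakers, n_words, n_frames = 4, 3, 6
word_templates = rng.normal(size=(n_words, n_frames, 16, 16))
X, y, speaker = [], [], []
for s in range(n_speakers):
    offset = rng.normal(scale=0.5, size=(16, 16))  # speaker-specific appearance
    for w in range(n_words):
        for _ in range(3):
            noise = rng.normal(scale=0.3, size=(n_frames, 16, 16))
            X.append(sequence_descriptor(word_templates[w] + offset + noise))
            y.append(w)
            speaker.append(s)
X, y, speaker = np.array(X), np.array(y), np.array(speaker)

# Hold out the last speaker; fit PCA on the training speakers only.
test_mask = speaker == n_speakers - 1
mean, comps = pca_fit(X[~test_mask], n_components=8)
pred = knn_predict(pca_transform(X[~test_mask], mean, comps), y[~test_mask],
                   pca_transform(X[test_mask], mean, comps), k=3)
print("held-out speaker accuracy:", (pred == y[test_mask]).mean())
```

Fitting PCA on the training speakers only keeps information from the held-out speaker out of the projection. Accuracy on this toy data says nothing about the figures reported in the abstract.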

Keywords:

Speech recognition, temporal features, spatial features, dimensionality reduction, classification

Authors

Ali Jafari Sheshpoli

Cyber space research inst., Shahid Beheshti University, Tehran, Iran

Ali Nadian-Ghomsheh

Cyber space research inst., Shahid Beheshti University, Tehran, Iran

How to Cite This Paper:

To cite this paper in your own work, you can use the following entry in the references section:

Jafari Sheshpoli, Ali and Nadian-Ghomsheh, Ali, 1396, Temporal and Spatial Features for Visual Speech Recognition, Fifth International Conference on Electrical and Computer Engineering with Emphasis on Indigenous Knowledge, Tehran, https://civilica.com/doc/725449

Scientometrics

Details of the institution that published this paper:

Ranking of Shahid Beheshti University

Type of center: Public university

Paper count: 32,534
