Temporal and Spatial Features for Visual Speech Recognition

Publication year: 1396 (Solar Hijri)
Document type: Conference paper
Language: English

This paper is 8 pages long and is available for download in PDF format.

National scientific document ID:

COMCONF05_473

Indexing date: 21 Ordibehesht 1397

Abstract:

Speech recognition from visual data is an important step towards communication when audio is not available. This paper considers several hand-crafted features, including HOG, MBH, DCT, LBP, MTC, and their combinations, for recognizing speech from a sequence of images. Several classifiers, including SVM, decision trees, the K-nearest neighbor algorithm, and the sub-space K-nearest neighbor algorithm, were tested for feature evaluation. The application of PCA for dimensionality reduction was also considered. Two sets of tests were carried out: lip pose recognition and recognition of isolated words. The MIRACL-VC1 data set was used for evaluation. Self-dependent tests reached an accuracy of over 95%, while in the self-independent tests the maximum recognition accuracy was about 52%.
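The difference between the two test settings can be illustrated with a minimal sketch. It assumes that "self-dependent" and "self-independent" correspond to what is usually called speaker-dependent and speaker-independent evaluation; the abstract does not define the terms, so this reading is an assumption. The `Sample` record, the function names, the toy feature vectors, and the 1-nearest-neighbor classifier are all hypothetical stand-ins and are not taken from the paper.

```python
from collections import namedtuple

# Hypothetical record for one utterance; the field names are illustrative only.
Sample = namedtuple("Sample", ["speaker", "word", "instance", "features"])


def dependent_split(samples, test_instances):
    """Hold out whole repetitions, so every speaker is in both train and test."""
    train = [s for s in samples if s.instance not in test_instances]
    test = [s for s in samples if s.instance in test_instances]
    return train, test


def independent_split(samples, test_speakers):
    """Hold out whole speakers, so test speakers are never seen in training."""
    train = [s for s in samples if s.speaker not in test_speakers]
    test = [s for s in samples if s.speaker in test_speakers]
    return train, test


def predict_1nn(train, features):
    """Return the word label of the closest training sample (squared Euclidean)."""
    def dist(s):
        return sum((a - b) ** 2 for a, b in zip(s.features, features))
    return min(train, key=dist).word


def accuracy(train, test):
    """Fraction of test samples whose 1-NN prediction matches the true word."""
    hits = sum(predict_1nn(train, s.features) == s.word for s in test)
    return hits / len(test)


# Toy data (not from the paper): 3 speakers x 2 words x 4 repetitions.
samples = [
    Sample(speaker=f"S{p}", word=f"W{w}", instance=i,
           features=(float(w) + 0.1 * i, float(p)))
    for p in range(1, 4) for w in range(1, 3) for i in range(1, 5)
]

dep_train, dep_test = dependent_split(samples, test_instances={4})
ind_train, ind_test = independent_split(samples, test_speakers={"S3"})

print("dependent accuracy:", accuracy(dep_train, dep_test))
print("independent accuracy:", accuracy(ind_train, ind_test))
```

The toy features are not designed to reproduce the accuracy gap reported in the abstract. The sketch only shows how the two splits differ: in the first, every speaker contributes to both training and test data, while in the second the test speakers are entirely unseen during training.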

Authors

Ali Jafari Sheshpoli

Cyberspace Research Institute, Shahid Beheshti University, Tehran, Iran

Ali Nadian-Ghomsheh

Cyberspace Research Institute, Shahid Beheshti University, Tehran, Iran