Phonetic Evaluation of Dynamic Audio-Visual Speaker Identification

Vahid Asadpour; Farzad Towhidkhah; Mehdi Homayoun poor

Published in: 14th Iranian Conference on Electrical Engineering

Publication year: 1385 (Solar Hijri; 2006-2007 CE)

Document type: Conference paper

Language: English

Length: 6 pages (PDF)

Permanent link to this paper:

https://civilica.com/doc/54694

National scientific document ID:

ICEE14_023

Indexing date: 25 Tir 1387 (Solar Hijri)

Abstract:

Biometry is the science of identifying people by their specific physical characteristics. Features such as fingerprints, voice, and iris have been used in biometric techniques. Dynamic audio-visual features offer the additional advantage of improved robustness to environmental noise. These parameters depend exclusively on the neuromuscular properties of the speaker, so imitation of valid speakers and false acceptance can be reduced to a large extent. We focus on visual feature extraction and propose a dynamic lip model system that extracts intrinsic features of moving limbs, such as viscosity, elasticity, damping, and mass, from speaker recordings. These features complement the lip motion vectors and their first- and second-order derivatives. Audio features are extracted using noise-robust relative spectral perceptual linear prediction (RASTA-PLP), and audio and video features are combined using a multistream pseudo-synchronized hidden Markov model. The superior performance of the proposed system is demonstrated on a large multispeaker database of continuously spoken digits and a phonetically rich sentence. On a recognition task at 15 dB acoustic signal-to-noise ratio (SNR), the noise-robust acoustic features alone yield a 9% error rate, while the combination of noise-robust acoustic features and dynamic muscle features yields a 1.5% error rate. For the audio-visual system, false rejection is reduced to as little as 0.5% and true identification rises to as much as 98.5% at low signal-to-noise ratios such as 3 dB.
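The abstract does not give the equations of the dynamic lip model, so the following is only a minimal sketch of the general idea, assuming a generic single-degree-of-freedom mass-spring-damper, m·x'' + c·x' + k·x = 0, observed as a free (unforced) response. Under that assumption only the ratios c/m and k/m are identifiable from the trajectory. The function names (`motion_derivatives`, `fit_damping_and_stiffness`, `synthetic_trajectory`) and the synthetic signal are hypothetical and are not taken from the paper.

```python
import numpy as np


def motion_derivatives(x, dt):
    """Return first and second time derivatives of a 1-D trajectory.

    Central differences are used in the interior (np.gradient);
    the edge samples are less accurate.
    """
    v = np.gradient(x, dt)  # first-order derivative (velocity)
    a = np.gradient(v, dt)  # second-order derivative (acceleration)
    return v, a


def fit_damping_and_stiffness(x, dt):
    """Least-squares estimate of (c/m, k/m) in x'' = -(c/m) x' - (k/m) x.

    Assumption: the trajectory is a free response of a linear
    second-order system, which is an illustrative simplification.
    """
    v, a = motion_derivatives(x, dt)
    inner = slice(2, -2)  # drop edge samples affected by one-sided differences
    design = np.column_stack([-v[inner], -x[inner]])
    coeffs, *_ = np.linalg.lstsq(design, a[inner], rcond=None)
    return float(coeffs[0]), float(coeffs[1])


def synthetic_trajectory(c_over_m, k_over_m, dt, duration):
    """Underdamped free response used only to exercise the fit."""
    t = np.arange(0.0, duration, dt)
    decay = c_over_m / 2.0
    omega_d = np.sqrt(k_over_m - decay ** 2)  # damped natural frequency
    return np.exp(-decay * t) * np.cos(omega_d * t)


if __name__ == "__main__":
    dt = 0.001
    x = synthetic_trajectory(c_over_m=4.0, k_over_m=400.0, dt=dt, duration=2.0)
    c_est, k_est = fit_damping_and_stiffness(x, dt)
    print(f"estimated c/m = {c_est:.3f}, k/m = {k_est:.3f}")
```

In a speaker identification setting, the estimated ratios could be appended to the per-frame motion vector and its derivatives as extra features; how the paper actually combines them is not stated in the abstract.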

Keywords:

biometry, speaker recognition, Markov chain, lip movement, lip-tracking

Authors

Vahid Asadpour

Department of Biomedical Engineering, Amirkabir University of Technology, Tehran, Iran

Farzad Towhidkhah

Department of Computer Engineering, Amirkabir University of Technology, Tehran, Iran

Mehdi Homayoun poor

Department of Computer Engineering, Amirkabir University of Technology, Tehran, Iran
