Synergy of Lip-Motion and Acoustic Features in Biometric Speech and Speaker Recognition

Maycel Isaac Faraj; Josef Bigun

Journal article, 2007

This paper presents the design and evaluation of a robust audio-visual digit and speaker recognition system using lip motion and speech biometrics. Moreover, a liveness verification barrier based on a person's lip movement is added to the system to guard against advanced spoofing attempts such as replayed videos. The acoustic and visual features are integrated at the feature level and evaluated first by a Support Vector Machine for digit and speaker identification and then by a Gaussian Mixture Model for speaker verification. Based on 300 different personal identities, this paper represents, to our knowledge, the first extensive study investigating the added value of lip-motion features for speaker and speech recognition applications. Digit recognition, person identification, and verification experiments are conducted on the publicly available XM2VTS database, showing favorable results (speaker verification is 98 percent, speaker identification is 100 percent, and digit identification is 83 percent to 100 percent).
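As a rough illustration of what integration "at the feature level" can mean, the sketch below concatenates per-frame acoustic and visual feature vectors into one joint vector before any classifier sees them. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `fuse_features`, the toy feature values, and the index-based resampling of the slower visual stream are all hypothetical.

```python
from typing import List, Sequence


def fuse_features(acoustic: Sequence[Sequence[float]],
                  visual: Sequence[Sequence[float]]) -> List[List[float]]:
    """Concatenate acoustic and visual feature vectors frame by frame.

    Assumption of this sketch: the visual stream has fewer frames than the
    acoustic stream, so each acoustic frame is paired with the visual frame
    at the same relative position in time (simple index scaling).
    """
    if not acoustic or not visual:
        raise ValueError("both feature streams must be non-empty")
    n_a, n_v = len(acoustic), len(visual)
    fused = []
    for i, a_vec in enumerate(acoustic):
        # Map acoustic frame index i onto the visual time axis.
        j = min(i * n_v // n_a, n_v - 1)
        fused.append(list(a_vec) + list(visual[j]))
    return fused


# Toy data (hypothetical): 4 acoustic frames of dimension 2,
# 2 visual (lip-motion) frames of dimension 1.
acoustic = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
visual = [[1.0], [2.0]]
fused = fuse_features(acoustic, visual)
```

Each fused vector could then be passed to a single classifier (for example an SVM or a GMM), which is what distinguishes feature-level fusion from combining the scores of separate audio and video classifiers.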

lip motion

SVM

lip reading

normal image velocity

GMM

speaker recognition

biometrics

motion estimation

Speech recognition

normal image flow

Author

Maycel Isaac Faraj

Chalmers, Signals and Systems

Josef Bigun

Chalmers, Signals and Systems


IEEE Transactions on Computers

0018-9340 (ISSN)

Vol. 56, Issue 9, 6-

Subject Categories (SSIF 2011)

Computer Vision and Robotics (Autonomous Systems)

More information

Created

10/6/2017
