A novel lip reading algorithm by using localized ACM and HMM: Tested for digit recognition

Abstract Lip contour tracking is an integral part of any lip reading application, and it must be both fast and accurate. This paper uses a novel active contour model (ACM) for lip tracking and proposes a geometrical feature extraction approach for lip reading. The effects of individual features are compared, and a joint feature model is obtained by combining the weighted decisions produced by a feature vector of the differences in inner area, height and width of the lip. An ergodic hidden Markov model (HMM) is used as the classifier. For each digit, the Markov model is tested with 3 states and with 5 states. Videos of the English digits 0 to 9 were recorded for the recognition test. The CUAVE database is used for comparison along with an in-house database. To reduce computational complexity, only significant frames are used when computing the feature vectors. Experimental results on digit utterances are given to show that the digits recognized most accurately can be used for important programming commands of computerized numerical control machines.
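The classification step described above, one ergodic HMM per digit with the most likely model chosen as the answer, can be sketched as follows. This is a minimal illustration and not the paper's implementation: it assumes the geometric features have already been quantized into discrete symbols, and the model parameters are random placeholders standing in for trained values. The names `ergodic_hmm`, `log_likelihood` and `classify` are hypothetical.

```python
import numpy as np

def ergodic_hmm(n_states, n_symbols, seed):
    """Build a random ergodic (fully connected) discrete HMM.

    Placeholder parameters only; a real system would train them
    (for example with Baum-Welch) on the lip feature sequences.
    """
    rng = np.random.default_rng(seed)
    pi = rng.dirichlet(np.ones(n_states))                 # initial state probabilities
    A = rng.dirichlet(np.ones(n_states), size=n_states)   # transitions, all entries > 0
    B = rng.dirichlet(np.ones(n_symbols), size=n_states)  # emission probabilities
    return pi, A, B

def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: returns log P(obs | model)."""
    alpha = pi * B[:, obs[0]]
    scale = alpha.sum()
    log_p = np.log(scale)
    alpha = alpha / scale
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # propagate one step, then weight by emission
        scale = alpha.sum()
        log_p += np.log(scale)          # accumulate the log of each scaling factor
        alpha = alpha / scale
    return log_p

def classify(obs, models):
    """Return the label whose HMM gives the highest log-likelihood, plus all scores."""
    scores = {label: log_likelihood(obs, *m) for label, m in models.items()}
    return max(scores, key=scores.get), scores

if __name__ == "__main__":
    # Assumed setup: 10 digit models with 3 states each and 8 quantized symbols.
    models = {d: ergodic_hmm(n_states=3, n_symbols=8, seed=d) for d in range(10)}
    sequence = [0, 3, 3, 5, 7, 2]       # hypothetical quantized feature sequence
    digit, _ = classify(sequence, models)
    print("predicted digit:", digit)
```

Changing `n_states` from 3 to 5 gives the second configuration mentioned in the abstract. Scaling inside the forward pass keeps the computation numerically stable on longer frame sequences.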
