Geometrical Feature Extraction for Robust Speech Recognition

Visual information from the lip contour has been shown to improve the robustness of automatic speech recognition, especially in noisy environments. This paper presents a novel lip-reading method. The method uses the hue information of the input images to detect the lip area, and then applies a set of morphological operations to detect the lip contour. Polynomial fitting is used to extract geometrical features from the contour. Hidden Markov models and Gaussian mixture models are trained on the extracted features to recognize speech. Experimental results demonstrate that the proposed method improves speech recognition rates in noisy environments. A further advantage of the method is its robustness to lighting variations.
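To make the polynomial-fitting step concrete, the following is a minimal sketch and not the authors' implementation. It assumes the hue-based detection and morphological operations have already produced a binary lip mask. The function name `lip_geometry`, the choice of a quadratic fit, and the returned features (width, height, and the polynomial coefficients of the upper and lower lip edges) are illustrative assumptions; the abstract does not specify the actual feature set.

```python
import numpy as np

def lip_geometry(mask, degree=2):
    """Fit polynomials to the upper and lower edges of a binary lip mask.

    mask: 2-D boolean array, True where a pixel belongs to the lip region.
    Returns a dict of illustrative features (an assumed feature set,
    not the one used in the paper).
    """
    mask = np.asarray(mask, dtype=bool)
    cols = np.flatnonzero(mask.any(axis=0))   # columns that contain lip pixels
    if cols.size <= degree:
        raise ValueError("mask too small to fit a polynomial of this degree")

    sub = mask[:, cols]
    upper = sub.argmax(axis=0)                # first True row in each column
    lower = mask.shape[0] - 1 - sub[::-1].argmax(axis=0)  # last True row

    x = cols - cols.mean()                    # centre x for a better-conditioned fit
    upper_coef = np.polyfit(x, upper, degree) # coefficients, highest power first
    lower_coef = np.polyfit(x, lower, degree)

    return {
        "width": int(cols[-1] - cols[0] + 1),
        "height": int((lower - upper).max() + 1),
        "upper_coef": upper_coef,
        "lower_coef": lower_coef,
    }
```

In a recognizer of the kind described above, a feature vector of this sort would be computed per video frame and the resulting sequence passed to the hidden Markov models. Row indices grow downward in image coordinates, so in this sketch an upper edge that dips at the mouth corners yields a positive leading coefficient.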
