Paper information - An approach to statistical lip modelling for speaker identification via chromatic feature extraction

This paper presents a novel technique for the tracking of moving lips for the purpose of speaker identification. In our system, a model of the lip contour is formed directly from chromatic information in the lip region. Iterative refinement of contour point estimates is not required. Colour features are extracted from the lips via concatenated profiles taken around the lip contour. Reduction of order in lip features is obtained via principal component analysis (PCA) followed by linear discriminant analysis (LDA). Statistical speaker models are built from the lip features based on the Gaussian mixture model (GMM). Identification experiments performed on the M2VTS database show encouraging results.
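The back end described in the abstract (PCA, then LDA, then one GMM per speaker) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The feature vectors are synthetic random data standing in for the concatenated colour profiles, and the dimensions, the number of mixture components, the diagonal-covariance assumption, and all function names (`fit_pca`, `fit_lda`, `fit_gmm`, `identify`) are assumptions chosen for illustration. A test utterance is assigned to the speaker whose GMM gives the highest average log-likelihood.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return (mean, projection matrix) for the top principal components."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, sorted by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components].T

def fit_lda(X, y, n_components):
    """Return directions maximising between-class over within-class scatter."""
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - overall_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(evals.real)[::-1][:n_components]
    return evecs.real[:, order]

def component_log_probs(X, weights, means, variances):
    """Log of weight * diagonal Gaussian density, shape (frames, components)."""
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.sum(diff**2 / variances, axis=2)
                        + np.sum(np.log(2 * np.pi * variances), axis=1))
    return np.log(weights) + log_gauss

def fit_gmm(X, n_mix, n_iter=50, seed=0):
    """Fit a diagonal-covariance GMM with EM (assumed model form)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    means = X[rng.choice(n, n_mix, replace=False)]
    variances = np.tile(X.var(axis=0) + 1e-6, (n_mix, 1))
    weights = np.full(n_mix, 1.0 / n_mix)
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each frame.
        log_p = component_log_probs(X, weights, means, variances)
        resp = np.exp(log_p - np.logaddexp.reduce(log_p, axis=1, keepdims=True))
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        variances = np.maximum((resp.T @ X**2) / nk[:, None] - means**2, 1e-6)
    return weights, means, variances

def avg_log_likelihood(X, gmm):
    return np.logaddexp.reduce(component_log_probs(X, *gmm), axis=1).mean()

def identify(frames, models):
    """Pick the speaker whose GMM best explains the frames."""
    scores = {spk: avg_log_likelihood(frames, gmm) for spk, gmm in models.items()}
    return max(scores, key=scores.get)

# Demo on synthetic data (a stand-in for real lip colour profiles).
rng = np.random.default_rng(0)
n_speakers, dim = 3, 20
centres = rng.normal(scale=3.0, size=(n_speakers, dim))

def sample(spk, n):
    return centres[spk] + rng.normal(size=(n, dim))

X_train = np.vstack([sample(s, 200) for s in range(n_speakers)])
y_train = np.repeat(np.arange(n_speakers), 200)

pca_mean, W_pca = fit_pca(X_train, n_components=10)
W_lda = fit_lda((X_train - pca_mean) @ W_pca, y_train, n_components=n_speakers - 1)

def project(X):
    return (X - pca_mean) @ W_pca @ W_lda

models = {s: fit_gmm(project(X_train[y_train == s]), n_mix=2)
          for s in range(n_speakers)}
predictions = [identify(project(sample(s, 50)), models) for s in range(n_speakers)]
print(predictions)
```

LDA yields at most one fewer direction than there are classes, which is why the sketch keeps `n_speakers - 1` components after the PCA step.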
