Paper information - A Comparison of Image Processing Techniques for Visual Speech Recognition Applications

We examine eight techniques for developing visual representations in machine vision tasks. In particular, we compare different versions of principal component analysis and independent component analysis, in combination with stepwise regression methods for variable selection. We found that local methods, based on the statistics of image patches, consistently outperformed global methods, based on the statistics of entire images. This result is consistent with previous work on emotion and facial expression recognition. In addition, the use of a stepwise regression technique for selecting variables and regions of interest substantially boosted performance.
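
The distinction between global and local representations in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the random 16x16 images, the 4x4 patch size, the component counts, and the helper names `pca_fit` and `extract_patches` are all assumptions made for illustration, and the independent component analysis and stepwise regression steps are omitted.

```python
import numpy as np

def pca_fit(X, k):
    """Return the mean and the top-k principal directions (as rows) of X."""
    mean = X.mean(axis=0)
    # Rows of Vt from the SVD of the centered data are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def extract_patches(images, patch):
    """Split (n, H, W) images into non-overlapping, flattened patch x patch blocks."""
    n, H, W = images.shape
    blocks = images.reshape(n, H // patch, patch, W // patch, patch)
    blocks = blocks.transpose(0, 1, 3, 2, 4)  # (n, H/p, W/p, p, p)
    return blocks.reshape(n, (H // patch) * (W // patch), patch * patch)

# Synthetic stand-in for image data (assumption: no real dataset is used here).
rng = np.random.default_rng(0)
images = rng.standard_normal((50, 16, 16))

# Global representation: PCA on the statistics of entire images.
flat = images.reshape(len(images), -1)                  # (50, 256)
g_mean, global_basis = pca_fit(flat, k=10)              # basis: (10, 256)
global_features = (flat - g_mean) @ global_basis.T      # (50, 10)

# Local representation: PCA on the statistics of image patches,
# with one shared basis applied to every patch location.
patches = extract_patches(images, patch=4)              # (50, 16, 16)
pooled = patches.reshape(-1, patches.shape[-1])         # (800, 16)
l_mean, local_basis = pca_fit(pooled, k=4)              # basis: (4, 16)
local_features = ((patches - l_mean) @ local_basis.T).reshape(len(images), -1)

print(global_features.shape, local_features.shape)
```

In this sketch the local features keep a separate set of coefficients for each patch location, which is what would let a later variable-selection step pick out regions of interest.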

Javier R. Movellan | Terrence J. Sejnowski | Michael S. Gray

[1] Garrison W. Cottrell, et al. EMPATH: Face, Emotion, and Gender Recognition Using Holons, 1990, NIPS.

[2] M. Bartlett, et al. Face image analysis by unsupervised learning and redundancy reduction, 1998.

[3] M. Turk, et al. Eigenfaces for Recognition, 1991, Journal of Cognitive Neuroscience.

[4] Garrison W. Cottrell, et al. Representing Face Images for Emotion Classification, 1996, NIPS.

[5] Terrence J. Sejnowski, et al. An Information-Maximization Approach to Blind Separation and Blind Deconvolution, 1995, Neural Computation.

[6] Javier R. Movellan, et al. Visual Speech Recognition with Stochastic Networks, 1994, NIPS.

[7] Juergen Luettin, et al. Visual Speech and Speaker Recognition, 1997.

[8] Marian Stewart Bartlett, et al. Classifying Facial Action, 1995, NIPS.