Paper information - Continuous audio-visual digit recognition using decision fusion

Audio-visual speech recognition systems can be divided into systems that integrate audio-visual features before decisions are made (feature fusion) and those that integrate decisions of separate recognisers for each modality (decision fusion).
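As a rough illustration of the two categories named in the abstract, the sketch below contrasts feature fusion (joining both modalities into one vector before any classifier runs) with decision fusion (combining the outputs of two separate recognisers). The function names, the weighted log-linear combination rule, the `audio_weight` parameter, and the toy posteriors are all assumptions made for illustration; they are not taken from the paper.

```python
import math


def feature_fusion(audio_frame, visual_frame):
    """Feature fusion: join both modalities into one vector before classification."""
    return list(audio_frame) + list(visual_frame)


def decision_fusion(audio_log_post, visual_log_post, audio_weight=0.5):
    """Decision fusion: combine per-class log posteriors from two recognisers.

    The weighted log-linear rule used here is an illustrative assumption.
    """
    scores = {
        label: audio_weight * audio_log_post[label]
        + (1.0 - audio_weight) * visual_log_post[label]
        for label in audio_log_post
    }
    # Renormalise with log-sum-exp so the fused scores form a distribution.
    peak = max(scores.values())
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores.values()))
    return {label: math.exp(s - log_norm) for label, s in scores.items()}


# Toy posteriors (hypothetical): the audio recogniser favours "one",
# the visual recogniser favours "two".
audio = {"one": math.log(0.6), "two": math.log(0.4)}
visual = {"one": math.log(0.2), "two": math.log(0.8)}

fused = decision_fusion(audio, visual, audio_weight=0.5)
best = max(fused, key=fused.get)
```

In this sketch, setting `audio_weight` to 1.0 reduces the fused result to the audio recogniser's posteriors, and lowering it shifts trust toward the visual stream, which is the kind of adjustment one might make as acoustic noise increases.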

Georg Meyer | Jeff Mulligan

[1] D. W. Massaro, et al. Evaluation and Integration of Visual and Auditory Information in Speech Perception, American Psychological Association, Inc., 2022.

[2] Giridharan Iyengar, et al. Large-vocabulary audio-visual speech recognition by machines and humans, 2001, INTERSPEECH.

[3] Yochai Konig, et al. "Eigenlips" for robust speech recognition, 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.

[4] Gregory J. Wolff, et al. Preprocessing video images for neural learning of lipreading, 1994, Other Conferences.

[5] Stephen J. Cox, et al. A Comparison of Active Shape Model and Scale Decomposition Based Features for Visual Speech Recognition, 1998, ECCV.

[6] David Pearce, et al. The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions, 2000, INTERSPEECH.

[7] Gerasimos Potamianos, et al. An image transform approach for HMM based automatic lipreading, 1998, Proceedings 1998 International Conference on Image Processing (ICIP98).

[8] Martin Heckmann, et al. Optimal weighting of posteriors for audio-visual speech recognition, 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings.

[9] D. Massaro, et al. Perceiving Talking Faces, 1995.

[10] Javier R. Movellan, et al. Channel Separability in the Audio-Visual Integration of Speech: A Bayesian Approach, 1996.

[11] Jean-Luc Schwartz, et al. Comparing models for audiovisual fusion in a noisy-vowel recognition task, 1999, IEEE Trans. Speech Audio Process.

[12] P. L. Silsbee. Sensory integration in audiovisual automatic speech recognition, 1994, Proceedings of 1994 28th Asilomar Conference on Signals, Systems and Computers.

[13] Richard Lippmann, et al. Speech recognition by machines and humans, 1997, Speech Commun.