Paper information - Non-linear representations, sensor reliability estimation and context-dependent fusion in the audiovisual recognition of speech in noise

Non-linear representations, sensor reliability estimation and context-dependent fusion in the audiovisual recognition of speech in noise

The paper addresses the recognition of French audiovisual vowels at various signal-to-noise ratios (SNRs). It introduces a new non-linear preprocessing of the audio data that makes it possible to estimate the reliability of the audio sensor as a function of SNR, and that yields a significant increase in recognition performance at the output of the fusion process.

Jean-Luc Schwartz | Anne Guérin-Dugué | Pascal Teissier

[1] Jeanny Hérault, et al. Curvilinear component analysis: a self-organizing neural network for nonlinear mapping of data sets, 1997, IEEE Trans. Neural Networks.

[2] D. Stork, et al. Speechreading by Man and Machine: Models, Systems, and Applications, 1996.

[3] Jean-Luc Schwartz, et al. Comparing models for audiovisual fusion in a noisy-vowel recognition task, 1999, IEEE Trans. Speech Audio Process.

[4] Jean-Luc Schwartz, et al. Constrained Neural Network for Estimating Sensor Reliability in Sensors Fusion, 1997, IWANN.

[5] Isabelle Bloch. Information combination operators for data fusion: a comparative review with classification, 1996, IEEE Trans. Syst. Man Cybern. Part A.