Non-linear representations, sensor reliability estimation and context-dependent fusion in the audiovisual recognition of speech in noise

The paper involves the recognition of French audiovisual vowels at various signal-to-noise ratios (SNRs). It deals with a new non-linear preprocessing of the audio data which enables an estimation of the reliability of the audio sensor in relation to SNR, and a significant increase in the recognition performances at the output of the fusion process.