Cognitively Inspired Audiovisual Speech Filtering

Previous developments in speech enhancement research (such as multi-microphone arrays and speech enhancement algorithms) have been implemented in commercial hearing aids for the benefit of the deaf community. In recent years, electronic hardware has advanced to the point that very sophisticated audio-only hearing aids have been developed. It is expected that future hearing aids will also make use of visual information from a camera input, combining audio and visual information to improve the quality and intelligibility of speech in real-world noisy environments.
