Binaural semi-blind dereverberation of noisy convoluted speech signals

To overcome the limited performance of conventional monaural models, this letter proposes a binaural blind dereverberation model. Its learning rule is derived from a blind least-squares measure that exploits higher-order characteristics of the output components. To prevent unwanted whitening of the speech signal, we adopt a semi-blind approach that employs a pre-determined whitening filter. The proposed model is evaluated under several simulated conditions, and the results show better speech quality than that of the monaural model. Its applicability to real environments is also demonstrated on real-recorded data. In particular, in real speech recognition experiments, the proposed model improves the word error rate from 13.9 ± 5.7% to 4.1 ± 3.5% across 13 test speakers.
