Generalization of Multi-Channel Linear Prediction Methods for Blind MIMO Impulse Response Shortening

The performance of many microphone array processing techniques deteriorates in the presence of reverberation. To provide a widely applicable solution to this longstanding problem, this paper generalizes existing dereverberation methods that use subband-domain multi-channel linear prediction filters, so that the resulting algorithm can blindly shorten a multiple-input multiple-output (MIMO) room impulse response between an unknown number of sources and a microphone array. Unlike existing dereverberation methods, the presented algorithm is developed without assuming specific acoustic conditions, and it provides a firm theoretical underpinning for the applicability of subband-domain multi-channel linear prediction methods. The generalization is achieved by using a new cost function for estimating the prediction filter, together with an efficient optimization algorithm. The generalized algorithm makes it easier to understand the common background underlying different dereverberation methods and may facilitate future technical development. Indeed, this paper also derives two alternative dereverberation methods from the proposed algorithm, both of which are advantageous in terms of computational complexity. Experimental results show that the proposed generalized algorithm effectively achieves blind MIMO impulse response shortening, especially in the mid-to-high frequency range.
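
For orientation, the basic idea behind subband-domain multi-channel linear prediction can be sketched as follows. This is not the generalized algorithm proposed in the paper: it is a plain least-squares delayed linear prediction in a single subband, without the new cost function or the optimization scheme mentioned above. The function name `delayed_linear_prediction`, the parameters `delay`, `order`, and `reg`, and the synthetic test signal are all assumptions made for illustration.

```python
import numpy as np

def delayed_linear_prediction(x, delay=2, order=5, reg=1e-8):
    """Illustrative sketch (not the paper's method).

    x: complex array of shape (channels, frames) holding one subband.
    Returns the prediction residual, same shape as x.
    Assumes frames > delay + order.
    """
    n_ch, n_frames = x.shape
    # Stack delayed observations from all channels.
    # Row block k holds x[:, t - delay - k] at column t (zeros before the start).
    past = np.zeros((n_ch * order, n_frames), dtype=x.dtype)
    for k in range(order):
        shift = delay + k
        past[k * n_ch:(k + 1) * n_ch, shift:] = x[:, :n_frames - shift]
    # Least-squares filter G minimizing sum_t ||x_t - G^H past_t||^2,
    # with a small diagonal loading term for numerical stability.
    R = past @ past.conj().T + reg * np.eye(n_ch * order)
    P = past @ x.conj().T
    G = np.linalg.solve(R, P)
    # Subtract the predicted late part from the observation.
    return x - G.conj().T @ past

# Toy example (assumed data): white sources with a recursive echo at lag 3.
rng = np.random.default_rng(0)
n_ch, n_frames = 2, 2000
s = (rng.standard_normal((n_ch, n_frames))
     + 1j * rng.standard_normal((n_ch, n_frames))) / np.sqrt(2)
x = s.copy()
for t in range(3, n_frames):
    x[:, t] += 0.5 * x[:, t - 3]

d = delayed_linear_prediction(x, delay=2, order=5)
```

In this toy case the echo lies entirely within the predicted lags, so the residual `d` should stay close to the white source `s`. Real speech is not white, which is one reason practical methods use a prediction delay and more elaborate cost functions.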
