TRINICON for Dereverberation of Speech and Audio Signals

In this chapter, we develop an analytical top-down approach to the problem of blind dereverberation of speech and audio signals based on TRINICON (TRIple-N Independent component analysis for CONvolutive mixtures), a general framework for broadband adaptive Multi-Input Multi-Output (MIMO) signal processing. Two fundamentally different approaches to the dereverberation problem for realistic scenarios can be distinguished. The “identification-and-inversion approach” is a two-step procedure consisting of blind identification of the acoustic MIMO mixing system, followed by inversion of the identified system. The alternative “direct-inverse approach” blindly estimates the inverse of the acoustic mixing system directly. As shown in this chapter, TRINICON yields the information-theoretically optimum estimation procedures for both cases in a unified way. This allows for a direct comparison between the approaches, paves the way to synergies, and yields various useful insights for practical realizations. The chapter also relates other known algorithms to the generic concept and presents novel improved algorithms as special cases of it.
