Blind reverberation cancellation techniques

Reverberation, a component of any sound generated in a natural environment, can degrade speech intelligibility or, more generally, the quality of a signal produced within a room. In a typical teleconferencing setup, for instance, the microphones receive both the speech and the reverberation of the surrounding space, and it is of interest to remove the latter from the signal that will be broadcast. A similar need arises in automatic speech recognition systems, where reverberation decreases the recognition rate. More ambitious applications have addressed the improvement of the acoustics of theatres, or even the creation of virtual acoustic environments. In all these cases dereverberation, the process of recovering the source signal by removing the unwanted reverberation, is critical.

Usually only a reverberated instance of the signal is available, so only a blind approach, which is a more difficult task, is possible. In more precise terms, unsupervised or blind audio dereverberation is the problem of removing reverberation from an audio signal without explicit knowledge of the acoustic system or of the input signal. Different approaches have been proposed for blind dereverberation. They can be divided into two classes according to whether or not the inverse acoustic system needs to be estimated.

The aim of this work is to investigate the problem of blind speech dereverberation, and in particular the methods based on an explicit estimate of the inverse acoustic system, known as "reverberation cancellation techniques". The following novel contributions are proposed: the formulation of single-channel and multichannel dereverberation algorithms based on a maximum likelihood (ML) approach and on the natural gradient (NG); and a new dereverberation structure that improves the decoupling of the speech and reverberation models.
Experimental results are provided to confirm the capability of these algorithms to successfully dereverberate speech signals.

Declaration of originality

I hereby declare that the research recorded in this thesis and the thesis itself was composed and originated entirely by myself in the Department of Electronics and Electrical Engineering at The University of Edinburgh.

Massimiliano Tonelli
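
As a rough illustration of what "estimating the inverse acoustic system" refers to in the abstract, the sketch below models reverberation as convolution of a dry signal with a room response and then undoes it with an inverse filter. Everything in it is an assumption made for illustration: the two-tap "room" `h`, the sample values in `s`, and the helper `convolve` are hypothetical, and the inverse filter is computed from the known `h`. A blind method has no access to `h` or `s` and would have to estimate the inverse filter from the reverberant signal alone, which is the hard part this sketch does not attempt.

```python
def convolve(a, b):
    """Full linear convolution of two finite sequences (plain Python)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# Hypothetical dry source signal (arbitrary sample values).
s = [1.0, -0.5, 0.25, 0.8, -0.3, 0.0, 0.6, -0.9]

# Toy "room": direct path plus one echo at half amplitude.
# This channel is minimum phase, so it has a stable causal inverse.
h = [1.0, 0.5]

# Reverberant observation: x = h * s (convolution).
x = convolve(h, s)

# Truncated causal inverse of h: 1 / (1 + 0.5 z^-1) = sum_k (-0.5)^k z^-k.
# NOTE: this uses knowledge of h, so it is NOT a blind method.
w = [(-0.5) ** k for k in range(32)]

# Dereverberated output: y = w * x, compared over the source's length.
y = convolve(w, x)[:len(s)]
max_err = max(abs(yi - si) for yi, si in zip(y, s))
print(f"max reconstruction error: {max_err:.2e}")
```

Real room responses are far longer than this toy channel and are generally not minimum phase, so a simple causal inverse of this kind is not available in general.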
