Wind Noise Reduction: Signal Processing Concepts

Technological progress means that devices such as mobile phones, tablet computers, and hearing aids are used for mobile communication in a wide variety of everyday situations. Acoustic background noise that is picked up together with the desired speech signal can impair both the signal quality and the intelligibility of a conversation. A special type of noise arises outdoors when the microphone is exposed to a wind stream: the result is a strong rumbling noise that is highly non-stationary. Consequently, conventional noise reduction approaches fail for noise induced by wind turbulence. This thesis focuses on the development of signal processing concepts that reduce the undesired effects of wind noise. The key contributions are:

• Signal analysis of wind noise
• Digital signal model for wind noise generation
• Signal processing algorithms for detection and reduction of wind noise signals

All of these topics are considered with a focus on algorithms for single- and dual-microphone systems. The analysis of recorded wind signals is the first step and provides valuable information for the estimation and reduction of wind noise. Furthermore, it leads to a signal model for generating reproducible artificial wind noise signals. To enhance the disturbed speech, an estimate of the underlying wind noise signal is required. In contrast to state-of-the-art noise estimation algorithms, the spectral shape and energy distribution are exploited to distinguish between speech and wind noise components, leading to a novel estimation scheme for the wind noise short-term power spectrum. For systems with two microphone inputs, the complex coherence function of the two recorded signals is exploited for wind noise estimation. In addition to commonly used noise reduction schemes based on spectral weighting, an innovative concept for speech enhancement is developed using techniques known from artificial bandwidth extension: highly disturbed speech segments are replaced by the corresponding segments of an artificial speech signal. Objective measures indicate a significant increase in both the signal-to-noise ratio and the speech intelligibility. In addition, two application examples show that the proposed methods are very efficient and robust in realistic scenarios.
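To make the idea of exploiting spectral shape and energy distribution more concrete, the following minimal sketch computes the spectral centroid of a single frame, one generic measure of where the energy of a spectrum is concentrated. The abstract does not state which feature the thesis uses, so this is an illustration, not the thesis's estimator. The function names, the two synthetic test frames, and the 200 Hz threshold are all hypothetical choices made for this example.

```python
import cmath
import math

SAMPLE_RATE = 8000          # Hz, assumed for this example
FRAME_LEN = 256             # samples per frame, assumed
CENTROID_THRESHOLD_HZ = 200.0  # hypothetical decision threshold, not from the thesis


def power_spectrum(frame):
    """One-sided power spectrum of a real frame via a direct DFT.

    The O(N^2) loop keeps the example dependency-free; an FFT would be
    used in practice.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_centroid_hz(frame, sample_rate):
    """Power-weighted mean frequency of the frame, in Hz (0.0 for silence)."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0:
        return 0.0
    bin_hz = sample_rate / len(frame)
    return sum(k * bin_hz * p for k, p in enumerate(power)) / total


def is_wind_dominated(frame, sample_rate=SAMPLE_RATE):
    """Flag a frame whose energy sits mostly at very low frequencies."""
    return spectral_centroid_hz(frame, sample_rate) < CENTROID_THRESHOLD_HZ


def tone(freq_hz, amplitude=1.0):
    """Synthetic sinusoid used as a stand-in for recorded audio."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE)
        for t in range(FRAME_LEN)
    ]


# Crude stand-ins: low-frequency rumble versus a frame with higher partials.
wind_like = tone(62.5)
speech_like = [a + b for a, b in zip(tone(500.0, 1.0), tone(1500.0, 0.5))]

for name, frame in (("wind-like", wind_like), ("speech-like", speech_like)):
    centroid = spectral_centroid_hz(frame, SAMPLE_RATE)
    print(f"{name}: centroid {centroid:.1f} Hz, wind dominated: {is_wind_dominated(frame)}")
```

With these synthetic frames the centroids come out at roughly 62.5 Hz and 700 Hz, so only the first frame falls below the assumed threshold. Real wind and speech overlap in frequency and vary from frame to frame, so a practical detector would need more than a single fixed threshold.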
