Musical-Noise-Free Speech Enhancement Based on Optimized Iterative Spectral Subtraction

In this paper, we provide a theoretical analysis of the amount of musical noise generated in iterative spectral subtraction, and propose a method for optimizing it so that the least musical noise is generated. Iterative spectral subtraction, i.e., iteratively applied weak nonlinear signal processing, has been proposed to achieve high-quality noise reduction with low musical noise. Although the effectiveness of the method has been reported experimentally, there have been no theoretical studies. Therefore, we formulate the generation process of musical noise by tracing the change in kurtosis of the noise spectra, and compare the amount of musical noise generated under different parameter settings that achieve the same level of noise attenuation. Furthermore, we theoretically derive the optimal internal parameters that generate no musical noise. We show that finding a fixed point in kurtosis yields the no-musical-noise property. Comparative experiments with commonly used noise reduction methods show the efficacy of the proposed method.
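The idea of tracing kurtosis across iterations can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the abstract: the input is noise only, with power-spectral samples drawn from an exponential distribution; kurtosis is taken as the raw fourth moment divided by the squared raw second moment; the noise estimate at each iteration is simply the mean of the current power samples; and the function names and the values of the oversubtraction parameter `beta` and flooring parameter `eta` are hypothetical.

```python
import random


def kurtosis(samples):
    """Raw-moment kurtosis mu4 / mu2**2 (assumed convention, not from the abstract)."""
    n = len(samples)
    mu2 = sum(x ** 2 for x in samples) / n
    mu4 = sum(x ** 4 for x in samples) / n
    return mu4 / (mu2 ** 2)


def spectral_subtraction(power, noise_est, beta, eta):
    """One pass of power spectral subtraction with flooring.

    beta: oversubtraction parameter (hypothetical name).
    eta:  flooring parameter applied to the amplitude, hence eta**2 on power.
    """
    out = []
    for p in power:
        if p > beta * noise_est:
            out.append(p - beta * noise_est)  # subtract the scaled noise estimate
        else:
            out.append((eta ** 2) * p)        # floor instead of going negative
    return out


def iterate(power, beta, eta, n_iter):
    """Apply weak spectral subtraction repeatedly and record kurtosis each time."""
    kurts = [kurtosis(power)]
    for _ in range(n_iter):
        # Assumption: noise power is re-estimated as the mean of the current samples.
        noise_est = sum(power) / len(power)
        power = spectral_subtraction(power, noise_est, beta, eta)
        kurts.append(kurtosis(power))
    return power, kurts


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumption: noise-only power spectrum modeled as exponential samples.
    noise_power = [rng.expovariate(1.0) for _ in range(20000)]
    _, kurts = iterate(noise_power, beta=0.5, eta=0.5, n_iter=5)
    for i, k in enumerate(kurts):
        print(f"iteration {i}: kurtosis ratio = {k / kurts[0]:.3f}")
```

Running the script prints the ratio of the kurtosis after each iteration to the initial kurtosis. Under the assumptions above, a ratio that stays close to 1 across iterations corresponds to the fixed-point behavior the abstract associates with the no-musical-noise property, while a growing ratio suggests increasing musical noise.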
