Enhancing noisy speech signals using orthogonal moments

This study describes a new approach to enhancing noisy speech signals using the discrete Tchebichef transform (DTT) and the discrete Krawtchouk transform (DKT). The DTT and DKT are based on two well-known families of orthogonal moments: the Tchebichef and Krawtchouk moments, respectively. The study shows how speech signals can be represented using a limited number of moment coefficients and how they behave in the orthogonal moment domain. The method removes noise from the signal by applying a minimum mean-square error (MMSE) criterion in the DTT or DKT domain. In initial experiments, comparisons with traditional methods yield promising results and indicate that orthogonal moments are applicable to speech signal enhancement. The application of orthogonal moments could be extended to speech analysis, compression and recognition.
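The idea of representing a speech frame with a limited number of moment coefficients, and of attenuating noise in the moment domain, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the study: the basis is obtained by QR-orthonormalising monomials on uniformly spaced points (one way to build discrete orthonormal polynomials of the Tchebichef type, up to sign), the frame length of 16 is arbitrary, the noise variance is assumed known, and the Wiener-style gain is a simple stand-in rather than the study's MMSE method. The helper names `orthonormal_poly_basis`, `truncate_moments` and `wiener_gain_denoise` are hypothetical.

```python
import numpy as np


def orthonormal_poly_basis(n_samples, n_orders):
    """Return an (n_orders, n_samples) matrix whose rows are discrete
    orthonormal polynomials of order 0..n_orders-1 on uniform points.

    Assumption: built by QR-orthonormalising monomials, not by the
    recurrence relations normally used for Tchebichef polynomials.
    """
    x = np.linspace(-1.0, 1.0, n_samples)
    vander = np.vander(x, n_orders, increasing=True)  # columns 1, x, x^2, ...
    q, _ = np.linalg.qr(vander)  # q has orthonormal columns
    return q.T


def truncate_moments(frame, basis, n_keep):
    """Project a frame onto the basis and keep only the first n_keep moments."""
    coeffs = basis @ frame
    coeffs[n_keep:] = 0.0
    return basis.T @ coeffs


def wiener_gain_denoise(frame, basis, noise_var):
    """Attenuate each moment coefficient with a Wiener-style gain.

    Assumption: noise_var > 0 is known, and the clean-signal power of each
    coefficient is crudely estimated as max(c^2 - noise_var, 0).
    """
    coeffs = basis @ frame
    signal_pow = np.maximum(coeffs ** 2 - noise_var, 0.0)
    gain = signal_pow / (signal_pow + noise_var)  # lies in [0, 1)
    return basis.T @ (gain * coeffs)


# Synthetic stand-in for one short speech frame (not real speech data).
rng = np.random.default_rng(0)
n = 16
t = np.arange(n)
clean = np.sin(2.0 * np.pi * t / n)
noise_var = 0.05
noisy = clean + rng.normal(scale=np.sqrt(noise_var), size=n)

basis = orthonormal_poly_basis(n, n)
compact = truncate_moments(clean, basis, n_keep=6)   # limited-moment representation
enhanced = wiener_gain_denoise(noisy, basis, noise_var)

print("truncation error:", np.linalg.norm(clean - compact))
print("noisy error:     ", np.linalg.norm(clean - noisy))
print("enhanced error:  ", np.linalg.norm(clean - enhanced))
```

Because the basis is orthonormal, the inverse transform is simply the transpose, so both truncation and gain weighting reduce to elementwise operations on the coefficient vector. A Krawtchouk-type basis could be swapped in for `basis` without changing the other two functions.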
