Speech enhancement based on wavelet packet of an improved principal component analysis

Principal component analysis is integrated into the wavelet packet decomposition. An extended PCA technique for speech enhancement is considered. A sparse matrix is obtained to contain the enhanced speech. Experiments use NOIZEUS data corrupted by Gaussian noise and four non-stationary noises. Our approach shows superior outcomes on the BSS EVAL toolbox, SegSNR, PESQ, and Cov measures.

In this paper, we propose a single-channel speech enhancement method based on the combination of the wavelet packet transform and an improved version of principal component analysis (PCA). Our method combines the ability of PCA to de-correlate the coefficients by extracting a linear relationship with that of wavelet packet analysis to derive feature vectors used for speech enhancement. This allows us to apply a convenient shrinkage function to these new coefficients, removing the noise without degrading the speech. Then, the enhanced speech obtained by the inverse wavelet packet transform is decomposed into three subspaces: low-rank, sparse, and remainder noise components. Finally, we estimate these components by treating the task as a segregation problem. The performance evaluation shows that our method provides higher noise reduction and lower signal distortion, even in highly noisy conditions, without introducing artifacts.
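The abstract gives no implementation details, so the following is only a minimal sketch of the kind of pipeline it describes, not the authors' method. It assumes a Haar wavelet packet tree, plain PCA across the sub-bands, soft-thresholding with a universal threshold estimated from the median absolute value, and a simple alternating low-rank/sparse split in the spirit of GoDec. All function names and parameters (`wp_forward`, `enhance`, `lowrank_sparse_split`, `rank`, `card`) are hypothetical, and the signal length is assumed to be divisible by `2**levels`.

```python
import numpy as np

def wp_forward(x, levels):
    """Full Haar wavelet packet tree; returns a (2**levels, len(x) / 2**levels) array."""
    nodes = [np.asarray(x, dtype=float)]
    for _ in range(levels):
        nxt = []
        for n in nodes:
            even, odd = n[0::2], n[1::2]
            nxt.append((even + odd) / np.sqrt(2))  # approximation child
            nxt.append((even - odd) / np.sqrt(2))  # detail child
        nodes = nxt
    return np.stack(nodes)

def wp_inverse(coeffs):
    """Invert wp_forward by merging sibling nodes level by level."""
    nodes = list(coeffs)
    while len(nodes) > 1:
        nxt = []
        for a, d in zip(nodes[0::2], nodes[1::2]):
            out = np.empty(2 * len(a))
            out[0::2] = (a + d) / np.sqrt(2)
            out[1::2] = (a - d) / np.sqrt(2)
            nxt.append(out)
        nodes = nxt
    return nodes[0]

def soft_threshold(z, t):
    """Shrink every coefficient toward zero by t (assumed shrinkage function)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def enhance(x, levels=3, t=None):
    """Decorrelate sub-band coefficients with PCA, shrink them, and resynthesize."""
    C = wp_forward(x, levels)
    mu = C.mean(axis=1, keepdims=True)
    _, V = np.linalg.eigh(np.cov(C))       # orthonormal principal directions
    Z = V.T @ (C - mu)                     # decorrelated coefficients
    if t is None:                          # assumption: universal threshold
        sigma = np.median(np.abs(Z)) / 0.6745
        t = sigma * np.sqrt(2.0 * np.log(Z.size))
    return wp_inverse(V @ soft_threshold(Z, t) + mu)

def lowrank_sparse_split(X, rank, card, iters=20):
    """Alternate a rank-`rank` fit and a `card`-entry sparse fit; return (L, S, N)."""
    X = np.asarray(X, dtype=float)
    S = np.zeros_like(X)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]       # low-rank part
        R = X - L
        S = np.zeros_like(X)
        idx = np.argsort(np.abs(R), axis=None)[-card:]  # largest residual entries
        S.flat[idx] = R.flat[idx]                       # sparse part
    return L, S, X - L - S                              # remainder is treated as noise
```

In this sketch `enhance(noisy)` returns a signal of the same length as its input, and `lowrank_sparse_split` would be applied to a matrix built from the enhanced speech (for example, overlapping frames). The universal threshold tends to over-smooth, so `t` would normally need tuning per noise condition.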
