Performance of nonlinear speech enhancement using phase space reconstruction

The paper presents the implementation of two nonlinear noise reduction methods applied to speech enhancement. The methods embed the noisy signal in a high-dimensional reconstructed phase space and apply singular value decomposition to project the signal onto a lower-dimensional subspace. Two advantages of these nonlinear methods are that they require no explicit model of the noise spectrum and do not produce the "musical tone" artifacts typical of traditional linear speech enhancement methods. The proposed nonlinear methods are compared with traditional speech enhancement techniques, including spectral subtraction, Wiener filtering, and Ephraim-Malah filtering, on example speech utterances with additive white noise at a variety of SNR levels. The results show that the local nonlinear noise reduction method outperforms Wiener filtering and spectral subtraction, but not Ephraim-Malah filtering, as had been suggested by previous studies.
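The embed-then-project idea can be illustrated with a minimal sketch. It is not the authors' implementation: it shows only a simple global variant (one SVD over the whole trajectory matrix, unit delay), whereas the local method described above would apply projections within neighborhoods of the phase space. The function names `delay_embed` and `svd_project_denoise`, the parameters `dim`, `lag`, and `rank`, and the step that averages overlapping delay coordinates back into a time series are all assumptions made for illustration.

```python
import numpy as np


def delay_embed(x, dim, lag=1):
    """Stack delayed copies of x into an (n_vectors, dim) trajectory matrix.

    Row i is the phase-space point (x[i], x[i + lag], ..., x[i + (dim-1)*lag]).
    """
    n = len(x) - (dim - 1) * lag
    if n <= 0:
        raise ValueError("signal too short for this dim and lag")
    return np.stack([x[j * lag : j * lag + n] for j in range(dim)], axis=1)


def svd_project_denoise(x, dim=20, rank=4):
    """Global SVD projection in a unit-delay reconstructed phase space.

    Assumed illustration only: keeps the `rank` largest singular directions
    and averages the overlapping delay coordinates to rebuild the signal.
    """
    x = np.asarray(x, dtype=float)
    traj = delay_embed(x, dim, lag=1)
    n = traj.shape[0]

    # Project onto the subspace spanned by the leading singular vectors.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    s[rank:] = 0.0
    low_rank = (u * s) @ vt

    # With lag 1, entry (i, j) estimates sample i + j, so each sample has
    # several estimates; average them to return to a one-dimensional signal.
    out = np.zeros(len(x))
    counts = np.zeros(len(x))
    for j in range(dim):
        out[j : j + n] += low_rank[:, j]
        counts[j : j + n] += 1.0
    return out / counts


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(2000)
    clean = np.sin(2 * np.pi * 0.01 * t)
    noisy = clean + 0.3 * rng.standard_normal(t.size)
    denoised = svd_project_denoise(noisy, dim=20, rank=2)
    print("noisy MSE:   ", np.mean((noisy - clean) ** 2))
    print("denoised MSE:", np.mean((denoised - clean) ** 2))
```

The demo uses a synthetic sinusoid in white noise rather than speech, so it says nothing about the comparative results reported above; the embedding dimension and retained rank would have to be chosen for real speech data.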
