Speech enhancement using ARCH model

In this paper, we investigate the use of the autoregressive conditional heteroscedasticity (ARCH) model as a replacement for the decision-directed method in the log-spectral amplitude estimator for speech enhancement. We employ three sound quality measures (speech distortion, noise reduction, and musical noise) and explain how the ARCH model parameters affect each of them. We compare the decision-directed and ARCH estimators and show that the ARCH model achieves better results than the decision-directed method on some of these measures, while offering a compromise between speech distortion and noise reduction.
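To make the comparison concrete, the following is a minimal per-bin sketch of the two a priori SNR recursions being contrasted. It assumes the standard decision-directed form of Ephraim and Malah and a generic ARCH(1) variance recursion. The abstract does not give the paper's exact formulation, so the function names, the parameters `alpha`, `xi_min`, `lam_min`, and `mu`, and their default values are illustrative assumptions only.

```python
def decision_directed_snr(prev_amp_sq, gamma, noise_var, alpha=0.98, xi_min=1e-3):
    """A priori SNR for one time-frequency bin via the decision-directed recursion.

    prev_amp_sq: squared amplitude estimate of the clean speech in the previous frame
    gamma:       a posteriori SNR of the current frame, |Y|^2 / noise_var
    alpha:       smoothing weight (assumed default, not taken from the paper)
    """
    xi = alpha * prev_amp_sq / noise_var + (1.0 - alpha) * max(gamma - 1.0, 0.0)
    return max(xi, xi_min)


def conditional_second_moment(xi, gamma, noise_var):
    """E{|X|^2 | Y} assuming complex Gaussian speech and noise coefficients."""
    w = xi / (1.0 + xi)  # Wiener-type weight
    return w * (1.0 + w * gamma) * noise_var


def arch_snr(prev_second_moment, noise_var, lam_min=1e-4, mu=0.9):
    """A priori SNR from an ARCH(1)-style one-frame-ahead variance prediction.

    prev_second_moment: estimate of E{|X|^2 | Y} from the previous frame
    lam_min, mu:        hypothetical ARCH parameters (variance floor and
                        feedback weight); the values here are placeholders
    """
    lam_pred = lam_min + mu * prev_second_moment
    return lam_pred / noise_var
```

In this sketch the ARCH path alternates two steps per frame: `arch_snr` predicts the speech variance from the previous frame, and `conditional_second_moment` updates it once the current noisy observation is available. The decision-directed path instead mixes the previous amplitude estimate with the current a posteriori SNR using a single fixed weight.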
