Speech Reconstruction by Sparse Linear Prediction

This paper proposes a new variant of the least squares autoregressive (LSAR) method for speech reconstruction, which estimates a segment of missing samples via least squares by applying the linear prediction (LP) model of speech. First, we show that a single high-order linear predictor can provide better results than the classic LSAR techniques based on short- and long-term predictors, without the need for a pitch detector. However, the reconstruction performance of this high-order predictor may degrade due to estimation errors (especially in the case of short pitch periods) and non-stationarity. To overcome these problems, we propose the use of a sparse linear predictor, in which many LP coefficients are zero and which resembles the classical speech model based on short- and long-term correlations. The experimental results show the superiority of the proposed approach in terms of both signal-to-noise ratio and perceptual performance.
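
As a rough illustration of the least-squares step that LSAR methods share, the following Python sketch fills a gap by minimising the energy of the LP residual over the missing samples. It is a minimal sketch under stated assumptions, not the estimator proposed in the paper: the function name `lsar_interpolate`, the synthetic sinusoid, and the hand-picked second-order predictor are all hypothetical, and the LP coefficients are assumed to be given rather than estimated (no sparse or high-order estimation is performed here).

```python
import numpy as np

def lsar_interpolate(x, missing, a):
    """Fill x[missing] by minimising the LP residual energy (illustrative sketch).

    x       : 1-D signal; values at the missing positions are ignored
    missing : indices of the samples to reconstruct
    a       : LP coefficients a_1..a_P, assumed known (hypothetical input)
    """
    x = np.asarray(x, dtype=float).copy()
    a = np.asarray(a, dtype=float)
    missing = np.asarray(missing)
    n, p = len(x), len(a)

    # Row i of A computes the residual e[i+p] = x[i+p] - sum_k a_k * x[i+p-k].
    h = np.concatenate(([1.0], -a))[::-1]  # [-a_P, ..., -a_1, 1]
    A = np.zeros((n - p, n))
    for i in range(n - p):
        A[i, i:i + p + 1] = h

    known = np.setdiff1d(np.arange(n), missing)
    # e = A_u x_u + A_k x_k, so the least-squares gap estimate solves
    # A_u x_u ~= -A_k x_k.
    rhs = -A[:, known] @ x[known]
    solution = np.linalg.lstsq(A[:, missing], rhs, rcond=None)[0]
    x[missing] = solution
    return x

# Usage on a synthetic signal (an assumption for illustration, not speech data):
# a pure sinusoid obeys x[n] = 2*cos(w)*x[n-1] - x[n-2] exactly.
w = 0.3
clean = np.sin(w * np.arange(200))
gap = np.arange(80, 100)
damaged = clean.copy()
damaged[gap] = 0.0
restored = lsar_interpolate(damaged, gap, a=[2 * np.cos(w), -1.0])
```

In this toy case the predictor matches the signal exactly, so the gap can be recovered almost perfectly. With real speech the coefficients must be estimated from the available samples, which is where the estimation errors and non-stationarity mentioned in the abstract come into play.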
