Sparse linear predictors for speech processing

This paper presents two new classes of linear prediction schemes. The first seeks a sparse residual rather than a minimum-variance one, which allows more efficient quantization; we show that this works well for voiced speech, where the excitation can be represented by an impulse train, and that it also yields a sparser residual for unvoiced speech. The second class seeks sparse prediction coefficients; it gives interesting results when applied to the joint estimation of long-term and short-term predictors. All of the proposed estimators are solutions to convex optimization problems, which can be solved efficiently and reliably using, e.g., interior-point methods.
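As a rough illustration of the first class, the sketch below fits a predictor by minimizing the 1-norm of the residual instead of its 2-norm. It rests on several assumptions that are not taken from the paper: the 1-norm criterion is used as the convex stand-in for sparsity, the problem is solved with iteratively reweighted least squares (IRLS) in NumPy rather than with an interior-point method, and the test signal, the helper names (`prediction_matrix`, `l1_predictor`) and all parameter values are hypothetical.

```python
import numpy as np


def prediction_matrix(x, order):
    """Return (target, X): each row of X holds the `order` past samples of target[i]."""
    n = len(x)
    X = np.column_stack([x[order - k : n - k] for k in range(1, order + 1)])
    return x[order:], X


def l1_predictor(x, order, iters=50, eps=1e-8):
    """Approximate argmin_a ||x - X a||_1 with IRLS (an assumed solver, not the paper's)."""
    target, X = prediction_matrix(x, order)
    # Start from the ordinary least-squares (minimum-variance) predictor.
    a = np.linalg.lstsq(X, target, rcond=None)[0]
    best_a, best_cost = a, np.abs(target - X @ a).sum()
    for _ in range(iters):
        # Weighted LS with squared weights 1/|r_i| mimics the 1-norm cost.
        w = 1.0 / np.sqrt(np.maximum(np.abs(target - X @ a), eps))
        a = np.linalg.lstsq(X * w[:, None], target * w, rcond=None)[0]
        cost = np.abs(target - X @ a).sum()
        if cost < best_cost:  # keep the iterate with the smallest 1-norm residual
            best_a, best_cost = a, cost
    return best_a


# Hypothetical "voiced" test signal: an impulse train through a stable AR(2) filter.
rng = np.random.default_rng(0)
a_true = np.array([1.3, -0.6])
excitation = 0.01 * rng.standard_normal(400)
excitation[::40] += 1.0  # one pitch pulse every 40 samples
x = np.zeros(400)
for n in range(400):
    past = [x[n - k] if n - k >= 0 else 0.0 for k in (1, 2)]
    x[n] = a_true @ np.array(past) + excitation[n]

target, X = prediction_matrix(x, order=2)
a_l2 = np.linalg.lstsq(X, target, rcond=None)[0]  # minimum-variance predictor
a_l1 = l1_predictor(x, order=2)                   # sparse-residual predictor
r_l2 = target - X @ a_l2
r_l1 = target - X @ a_l1
print("2-norm fit:", a_l2, "residual 1-norm:", np.abs(r_l2).sum())
print("1-norm fit:", a_l1, "residual 1-norm:", np.abs(r_l1).sum())
```

Comparing the two residuals on such a signal gives a feel for why a sparsity-oriented criterion can suit an impulse-train excitation. A sparse-coefficient variant would add a penalty on the 1-norm of the coefficient vector to the same cost; that is left out here to keep the sketch short.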
