On methods for perfect reconstruction WI speech coding with preprocessing

The waveform interpolation (WI) speech coding algorithm has been shown to be an efficient method for describing the evolution of periodic voiced components in the speech signal. However, conventional WI coding does not provide the perfect reconstruction property, i.e. the decoded signal does not converge to the original signal as the quantization error decreases. Errors in the coding model therefore cannot be corrected by more accurate quantization. In this paper we discuss the characteristics of the WI coding model and modifications to the model that enable the perfect reconstruction property. The new requirements and features are examined in detail. While the perfect reconstruction property brings many benefits, it also places new demands on the operation of the coder. Particularly high requirements are set on the accuracy of the pitch estimate; inaccuracies rapidly impair the ability to quantize the parameters efficiently. To overcome this, we introduce a preprocessing method that slightly modifies the pitch structure of the residual signal before waveform extraction. The modifications to the signal are minor, so the quality of the preprocessed signal is very close to that of the input speech. In the proposed method, the perfect reconstruction property is maintained with respect to the preprocessed signal.
