Speech Pre-Processing for Pitch and Pitch-Cycle Evolutions Smoothing

In low-bit-rate speech coders, pitch is usually transmitted once per frame and, when needed, intermediate pitch values are obtained by interpolating between two adjacent pitch values. Although pitch usually evolves slowly, it sometimes varies irregularly, and the estimated pitch then differs from the real one. In addition, some speech coders, e.g., waveform interpolation coders, rely on smooth pitch-cycle evolutions to extract speech model parameters in the analysis stage. However, the non-stationary characteristics of speech may lead to inaccurate estimation of these parameters, which degrades the synthesised speech quality. We propose a pre-processor that modifies the residual speech signal to provide smooth pitch variations and pitch-cycle evolutions without distorting perceptual speech quality. As a result, the pitch and the voicing level can be determined more accurately.
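The per-frame pitch interpolation mentioned above can be illustrated with a minimal sketch. It assumes linear interpolation and a fixed number of subframes per frame; the abstract does not say which interpolation rule is used, and the function name `interpolate_pitch` and the numeric values are hypothetical.

```python
def interpolate_pitch(prev_pitch, next_pitch, num_subframes):
    """Linearly interpolate pitch between two adjacent frame values.

    Assumption: linear interpolation; the source does not specify the rule.
    Returns num_subframes values, the last of which equals next_pitch.
    """
    if num_subframes < 1:
        raise ValueError("num_subframes must be at least 1")
    step = (next_pitch - prev_pitch) / num_subframes
    return [prev_pitch + step * (k + 1) for k in range(num_subframes)]


# Hypothetical example: the pitch period moves from 40 to 48 samples
# over a frame split into 4 subframes.
print(interpolate_pitch(40.0, 48.0, 4))  # [42.0, 44.0, 46.0, 48.0]
```

Under this assumption, an irregular jump between two transmitted values (for example, a pitch-doubling error from 40 to 80 samples) would be spread across every subframe of the frame, which is one way the interpolated pitch can drift from the real one.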
