A Pre-processing Method to Modify Irregular Pitch Variations for Quality Enhancement of Synthesised Speech

In low-bit-rate speech coders, pitch and voicing-level estimation play an important role in the quality of the synthesised speech. Although pitch usually evolves smoothly, it sometimes varies irregularly, and as a result the estimated pitch and voicing level differ from the true values. This degrades the performance of the speech coder. We propose a new modification method to be used as a pre-processor. The method modifies the residual speech signal so that the pitch period evolves more smoothly, without distorting perceptual speech quality. The pitch and the voicing level can then be determined correctly. Experimental results show that combining the proposed method with the 2.4 kb/s MELP coder improves the quality of the synthesised speech.
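To make the notion of an "irregular pitch variation" concrete, the following is a minimal sketch, not the method proposed here. It operates on a track of estimated pitch periods rather than on the residual signal, and it assumes that an irregular frame is one whose period deviates from the local median by more than a relative threshold. The function name `smooth_pitch_track` and the parameters `half_window` and `max_rel_dev` are hypothetical and chosen for illustration only.

```python
from statistics import median


def smooth_pitch_track(periods, half_window=2, max_rel_dev=0.2):
    """Replace pitch periods that deviate sharply from their local median.

    periods     : pitch period per frame, in samples (illustrative input)
    half_window : number of neighbouring frames used on each side (assumed)
    max_rel_dev : allowed relative deviation from the local median (assumed)
    """
    smoothed = list(periods)
    for i, p in enumerate(periods):
        lo = max(0, i - half_window)
        hi = min(len(periods), i + half_window + 1)
        local = median(periods[lo:hi])
        # Flag the frame as irregular if it is far from its neighbourhood.
        if abs(p - local) > max_rel_dev * local:
            smoothed[i] = local
    return smoothed


# The fourth frame (41 samples) looks like a pitch-halving outlier.
track = [80, 81, 82, 41, 83, 84, 85]
print(smooth_pitch_track(track))
```

In this toy track only the outlying frame is replaced; the smoothly evolving frames are left unchanged. This mirrors the stated goal of a pitch period that evolves smoothly, although the proposed pre-processor achieves it by modifying the residual signal itself.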
