High-quality digital speech at 4 kb/s

A speech coder based on a single-pulse excitation code-excited linear predictive coding (SPE-CELP) model of linear predictive coding (LPC) is proposed. An algorithm is described for determining the time instants of pitch periods within a short interval of periodic speech; it yields a time sequence of marker points that indicate the beginning of each pitch period in the analyzed interval. The LPC excitation is generated by a stochastic codebook for nonperiodic speech and by a single pulse per pitch period for periodic speech. The proper alignment of the excitation pulse is computed efficiently using dynamic programming. It is concluded that, at overall bit rates of around 4 kb/s, the coder produces significantly better speech quality than LPC10E, though the synthesized speech still sounds slightly buzzy for certain speakers.
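The two excitation modes named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Gaussian codebook, the fixed gains, and the sign convention of the all-pole synthesis filter are assumptions made for illustration, and the marker positions are taken as given rather than found by the pitch-marker algorithm or the dynamic-programming alignment.

```python
import random

def single_pulse_excitation(frame_len, markers, gain=1.0):
    """Periodic speech: zeros everywhere except one pulse per pitch marker.

    `markers` are sample indices assumed to come from a pitch-marker
    algorithm; out-of-range markers are ignored.
    """
    excitation = [0.0] * frame_len
    for m in markers:
        if 0 <= m < frame_len:
            excitation[m] = gain
    return excitation

def make_codebook(size, frame_len, seed=0):
    """Hypothetical stochastic codebook: `size` Gaussian random vectors."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(frame_len)]
            for _ in range(size)]

def stochastic_excitation(codebook, index, gain=1.0):
    """Nonperiodic speech: a scaled copy of the selected codebook vector."""
    return [gain * x for x in codebook[index]]

def lpc_synthesize(excitation, a):
    """All-pole synthesis, assuming s[n] = e[n] + sum_k a[k] * s[n-1-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc += ak * out[n - 1 - k]
        out.append(acc)
    return out

if __name__ == "__main__":
    # Voiced frame: pulses at two assumed pitch markers, first-order filter.
    voiced = lpc_synthesize(single_pulse_excitation(8, [1, 5]), [0.5])
    # Unvoiced frame: codebook entry 2 through the same filter.
    codebook = make_codebook(size=4, frame_len=8)
    unvoiced = lpc_synthesize(stochastic_excitation(codebook, 2), [0.5])
    print(voiced)
    print(unvoiced)
```

In a real coder the codebook index, the gains, and the pulse positions would be chosen by an analysis-by-synthesis search against the input speech; here they are fixed by hand only to show the structure of the two excitation signals.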
