Low‐Bit‐Rate Speech Coding

This article focuses on speech coding methods that achieve communication-quality speech at bit rates of 4 kbit/s and below. The techniques are based on an all-pole model of the vocal tract, which may be implemented in the time domain with appropriately selected excitation functions or fitted to a spectral analysis of the speech signal. Three main types of coders are described. Code-excited linear prediction (CELP) coders select their excitation from waveform codebooks using closed-loop analysis-by-synthesis techniques, which must be supplemented by speech classification and open-loop parametric techniques to maintain quality at lower rates. The prototypical sinusoidal coder (SC) synthesizes the signal with a bank of oscillators driven by a model of the magnitude spectrum; at low rates, phase regeneration is also important for enhancing the reconstructed speech. Waveform interpolation (WI) coders afford a wider time-frequency footprint for representing the excitation and show good potential for achieving toll quality at bit rates below 4 kbit/s.

Keywords: low bit rate speech coding; vocoder; codec; rate-distortion function; code-excited linear prediction; CELP; algebraic CELP; ACELP; linear prediction; LP; linear predictive coding; LPC; sinusoidal coder; waveform interpolation; WI; complexity; bit rate; fidelity; distortion; speech synthesis
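As a rough illustration of the closed-loop, analysis-by-synthesis selection that the abstract attributes to CELP coders, the sketch below passes each candidate excitation through an all-pole synthesis filter and keeps the one whose gain-scaled output is closest to a target in squared error. It is a minimal sketch, not the method of any particular coder or standard. The function names `synthesize` and `search_codebook`, the sign convention A(z) = 1 + sum of a_k z^-k, and the plain squared-error criterion are all assumptions made for the example. Perceptual weighting, the adaptive (pitch) codebook, and filter memory across frames are omitted.

```python
def synthesize(excitation, lp_coeffs):
    """All-pole synthesis filter 1/A(z), zero initial state.

    Assumed convention: A(z) = 1 + sum_k a[k] z^-k, so
    s[n] = e[n] - sum_k a[k] * s[n - k].
    """
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lp_coeffs, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def search_codebook(target, codebook, lp_coeffs):
    """Exhaustive closed-loop search (hypothetical helper).

    Returns (index, gain, error) for the codevector whose synthesized,
    optimally scaled output minimizes the squared error to the target,
    or None if every candidate synthesizes to zero energy.
    """
    best = None
    for index, codevector in enumerate(codebook):
        y = synthesize(codevector, lp_coeffs)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # a silent candidate cannot match any target
        # Least-squares optimal gain for this candidate.
        gain = sum(t * v for t, v in zip(target, y)) / energy
        error = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best
```

Calling `search_codebook(target, codebook, lp_coeffs)` with a list of equal-length candidate excitations returns the winning index and its gain, which are the quantities a coder of this kind would transmit in place of the waveform itself.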
