Low Delay Coding of Wideband Speech at 32 Kbps Using Tree Structures

The prospect of high-quality, commentary-grade, multi-channel/multi-user speech communication via the emerging ISDN has generated considerable interest in advanced coding algorithms for 50-7000 Hz wideband speech. A high-quality 32 Kbps wideband speech coder has recently been developed in our laboratory [1,2]. This coder is based on the Low-Delay Code-Excited Linear-Predictive (LD-CELP) algorithm. It employs 5-sample vector quantization (VQ) and has an end-to-end delay of only about 0.94 ms. Its performance, as judged by informal listening tests, is comparable to that of the 64 Kbps CCITT standard wideband coder (G.722) [3]. Since a much longer delay can be tolerated in many (if not all) wideband-speech applications [4], it is possible, in principle, to improve the performance further by increasing the frame size and the coding delay. A straightforward extension of the frame size, however, implies the exponential increase in coding complexity that is characteristic of VQ-based algorithms.
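The exponential growth mentioned above can be illustrated with a short calculation. This is only a sketch under stated assumptions: the 16 kHz sampling rate is not given in the abstract (it is the rate used by G.722 for 50-7000 Hz speech), and the sketch assumes that every transmitted bit indexes a single exhaustively searched codebook, which ignores any shape/gain or other codebook structure the actual coder may use. The function names are hypothetical.

```python
# Assumption: 16 kHz sampling (not stated in the abstract).
SAMPLE_RATE_HZ = 16_000
BIT_RATE_BPS = 32_000


def bits_per_vector(dim, bit_rate=BIT_RATE_BPS, sample_rate=SAMPLE_RATE_HZ):
    """Bits available to code one vector of `dim` samples."""
    return dim * bit_rate // sample_rate


def codebook_size(dim, bit_rate=BIT_RATE_BPS, sample_rate=SAMPLE_RATE_HZ):
    """Codevectors in an unstructured codebook that uses all available bits."""
    return 2 ** bits_per_vector(dim, bit_rate, sample_rate)


def vector_duration_ms(dim, sample_rate=SAMPLE_RATE_HZ):
    """Time spanned by one vector, in milliseconds."""
    return 1000.0 * dim / sample_rate


if __name__ == "__main__":
    for dim in (5, 10, 20):
        print(
            f"dim={dim:2d}  duration={vector_duration_ms(dim):.4f} ms  "
            f"bits={bits_per_vector(dim):2d}  codevectors={codebook_size(dim)}"
        )
```

Under these assumptions a 5-sample vector carries 10 bits (1024 codevectors), and doubling the vector length to 10 samples squares the codebook size. The quoted 0.94 ms delay is also close to three 5-sample vectors at 16 kHz (0.9375 ms), although the abstract does not say how the delay is composed.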

[1]  P. Mermelstein. G.722: a new CCITT coding standard for digital transmission of wideband audio signals, 1988, IEEE Communications Magazine.

[2]  John B. Anderson, et al. Sequential Coding Algorithms: A Survey and Cost Analysis, 1984, IEEE Trans. Commun.

[3]  Thomas P. Barnwell, et al. Recursive windowing for generating autocorrelation coefficients for LPC analysis, 1981.

[4]  Yair Shoham, et al. Low-delay code-excited linear-predictive coding of wideband speech at 32 kbps, 1991, Proceedings of ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[5]  John B. Anderson, et al. Tree encoding of speech, 1975, IEEE Trans. Inf. Theory.