Paper information - Low Delay Coding of Wideband Speech at 32 Kbps Using Tree Structures

The prospect of high-quality, commentary-grade, multi-channel/multi-user speech communication over the emerging ISDN has generated considerable interest in advanced coding algorithms for 50-7000 Hz wideband speech. A high-quality 32 Kbps wideband speech coder has recently been developed in our laboratory [1,2]. This coder is based on the Low-Delay Code-Excited Linear-Predictive (LD-CELP) algorithm. It employs 5-sample vector quantization (VQ) and has an end-to-end delay of only about 0.94 ms. Its performance, as judged by informal listening tests, is comparable to that of the 64 Kbps CCITT standard wideband coder (G.722) [3]. Since a much longer delay can be tolerated in many (if not all) wideband-speech applications [4], it is possible, in principle, to further improve the performance by increasing the frame size and the coding delay. A straightforward extension of the frame size, however, implies an exponential increase in coding complexity, which is characteristic of VQ-based algorithms.
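The exponential growth mentioned above can be illustrated with a little arithmetic. The sketch below is not taken from the paper: it assumes a 16 kHz sampling rate (the usual choice for 50-7000 Hz wideband speech) and assumes that the entire 32 Kbps budget is spent on the VQ index, so that a vector of N samples is coded with 2N bits. The function names are hypothetical and exist only for this illustration.

```python
# Illustrative arithmetic only; the constants below are assumptions,
# not values quoted from the paper's abstract (except the bit rate).
SAMPLE_RATE_HZ = 16_000  # assumed sampling rate for 50-7000 Hz speech
BIT_RATE_BPS = 32_000    # coder bit rate stated in the abstract


def bits_per_vector(vector_len: int) -> int:
    """Bits available per VQ vector if all bits go to the VQ index."""
    return vector_len * BIT_RATE_BPS // SAMPLE_RATE_HZ


def codebook_size(vector_len: int) -> int:
    """Number of codevectors an exhaustive search would have to examine."""
    return 2 ** bits_per_vector(vector_len)


def vector_duration_ms(vector_len: int) -> float:
    """Duration of one vector in milliseconds."""
    return 1000.0 * vector_len / SAMPLE_RATE_HZ


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(
            f"{n:2d} samples: {vector_duration_ms(n):.4f} ms, "
            f"{bits_per_vector(n):2d} bits, {codebook_size(n):,} codevectors"
        )
```

Under these assumptions a 5-sample vector lasts 0.3125 ms and needs a 1,024-entry codebook, while a 10-sample vector already needs 1,048,576 entries. The quoted delay of about 0.94 ms is also numerically close to three 5-sample vector durations (0.9375 ms), although the abstract does not say how the delay is composed.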

Yair Shoham

[1] P. Mermelstein. G.722: a new CCITT coding standard for digital transmission of wideband audio signals, 1988, IEEE Communications Magazine.

[2] John B. Anderson, et al. Sequential Coding Algorithms: A Survey and Cost Analysis, 1984, IEEE Trans. Commun.

[3] Thomas P. Barnwell, et al. Recursive windowing for generating autocorrelation coefficients for LPC analysis, 1981.

[4] Yair Shoham, et al. Low-delay code-excited linear-predictive coding of wideband speech at 32 kbps, 1991, Proceedings of ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[5] John B. Anderson, et al. Tree encoding of speech, 1975, IEEE Trans. Inf. Theory.