A bit-rate/bandwidth scalable speech coder based on ITU-T G.723.1 standard

This paper presents a new scalable coder based on the ITU-T G.723.1 standard, one of the best-known speech coders for VoIP applications. To support both bit-rate and bandwidth scalability, the proposed coder adopts a split-band approach in which the input signal, sampled at 16 kHz, is decomposed into two equal frequency bands. The lower band is coded with a standard coder such as G.723.1. A low-band enhancement layer then improves the perceptual quality of the decoded lower-band speech by adding coding units based on a cascaded codebook approach. The higher band is encoded with an MDCT-based transform coding scheme. At a bit rate of 19.4 kbit/s, the proposed coder provides speech quality comparable to the 24 kbit/s ITU-T G.722.1 coder while remaining interoperable with G.723.1.
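The split-band structure can be illustrated with a minimal sketch. The abstract does not say which analysis filters the coder uses, so the 2-tap Haar filter pair below is an assumption chosen only because it makes perfect reconstruction easy to verify; a practical coder would use much longer filters for sharper band separation. The function names `split_bands` and `merge_bands` are hypothetical. The 480-sample frame is also an illustrative choice: 30 ms at 16 kHz, which leaves 240 samples per band at 8 kHz, matching the 30 ms frame that G.723.1 operates on.

```python
import math

def split_bands(x):
    """Split a wideband frame into two critically sampled half-bands.

    Assumption: 2-tap Haar analysis filters followed by decimation by 2.
    Each output band runs at half the input sampling rate.
    """
    if len(x) % 2:
        raise ValueError("expected an even number of samples")
    s = 1.0 / math.sqrt(2.0)
    low = [s * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [s * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high

def merge_bands(low, high):
    """Inverse of split_bands: rebuild the wideband frame from both bands."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for lo, hi in zip(low, high):
        out.append(s * (lo + hi))
        out.append(s * (lo - hi))
    return out

FS_WIDEBAND = 16000  # input sampling rate in Hz, as stated in the abstract

# One 30 ms wideband frame holding a 1 kHz tone (illustrative test signal).
frame = [math.sin(2 * math.pi * 1000 * n / FS_WIDEBAND) for n in range(480)]

low, high = split_bands(frame)     # each band: 240 samples at 8 kHz
rebuilt = merge_bands(low, high)   # decoder side, with no quantization
```

In a coder of the kind described, `low` would be passed to the narrowband core coder and its enhancement layer, and `high` to the transform coder. A decoder that receives only the core bitstream can still produce narrowband speech from the lower band alone, which is what makes the bandwidth scalable.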
