Scalable audio coder based on quantizer units of MDCT coefficients

A scalable codec has been constructed from transform coding and basic modules for a scalable encoder and decoder. It lets users choose among a variety of scalable configurations in the frequency domain. The basic module is a quantizer that quantizes MDCT (modified discrete cosine transform) coefficients taken from a variety of frequency regions, and it mainly works at bit rates of more than 8 kbit/s. The target frequency regions of the basic module's input and output signals can also be changed in each transform frame; i.e., the scalable structure can be adapted to the nature of the input signal. In the scalable codec described here, the input and output signals are monaural, the sampling frequency is 24 kHz, and the total bit rate is more than 8 kbit/s. Subjective quality evaluation tests, mainly on musical sound sources, showed that its sound quality is better than that of an MPEG-2 layer 3 codec at 8, 16, and 24 kbit/s when the scalable codec is built from 8-kbit/s basic modules. In combination with AAC (advanced audio coding), our scalable codec will be chosen as an international standard in ISO/IEC MPEG-4 Audio.
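The layered structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the quantizer design, so a uniform scalar quantizer stands in for the basic module, and the names `mdct`, `encode`, `decode`, the layer tuples `(lo, hi, step)`, and the test signal are all hypothetical. Windowing, bit allocation, and bit-rate control are omitted. Each layer quantizes the residual left by the earlier layers in its own frequency region, and the layer list could in principle differ from frame to frame.

```python
import math

def mdct(frame):
    """Direct-form MDCT: 2N time samples -> N coefficients (no window, for brevity)."""
    n = len(frame) // 2
    return [
        sum(x * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t, x in enumerate(frame))
        for k in range(n)
    ]

def encode(coeffs, layers):
    """Quantize coefficients layer by layer.

    layers: list of (lo, hi, step); each layer covers bins lo..hi-1 and
    quantizes whatever residual the earlier layers left in that region.
    The uniform scalar quantizer is an assumption, not the paper's module.
    """
    residual = list(coeffs)
    stream = []
    for lo, hi, step in layers:
        indices = [round(residual[k] / step) for k in range(lo, hi)]
        for i, q in enumerate(indices):
            residual[lo + i] -= q * step
        stream.append(indices)
    return stream

def decode(stream, layers, n_bins, n_layers):
    """Reconstruct coefficients from the first n_layers layers only."""
    out = [0.0] * n_bins
    for indices, (lo, hi, step) in list(zip(stream, layers))[:n_layers]:
        for i, q in enumerate(indices):
            out[lo + i] += q * step
    return out

def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical 32-sample frame: two sinusoids, giving 16 MDCT bins.
frame = [math.sin(2 * math.pi * 3 * t / 32) + 0.3 * math.sin(2 * math.pi * 11 * t / 32)
         for t in range(32)]
coeffs = mdct(frame)

# Layer 1: coarse low band; layer 2: high band; layer 3: full-band refinement.
layers = [(0, 8, 2.0), (8, 16, 0.5), (0, 16, 0.125)]
stream = encode(coeffs, layers)

# Reconstruction error when the decoder receives 0, 1, 2, or 3 layers.
errors = [squared_error(coeffs, decode(stream, layers, 16, m)) for m in range(4)]
print(errors)
```

Decoding a prefix of the layers still yields a valid, coarser reconstruction, which is the property that makes such a stream scalable; the error in the MDCT domain does not increase as further layers are added.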
