Scalable and progressive audio codec

A source coding technique for variable, bandwidth-constrained channels such as the Internet must do two things: offer high quality at low data rates, and adapt gracefully to changes in available bandwidth. Here we propose an audio coding algorithm that is superior on both counts. It is inherently scalable, meaning that the bit rate can be matched to channel conditions without additional computation. Moreover, it is compact: in subjective tests our algorithm, coded at 32 kb/s/channel, outperformed MPEG-1 Layer 3 (MP3) coded at 56 kb/s/channel (both sampled at 44.1 kHz). We achieve this simultaneous increase in compression and scalability by using a two-dimensional transform that concentrates relevant information into a small number of coefficients.
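The idea of energy compaction enabling progressive decoding can be illustrated with a minimal sketch. The abstract does not specify the transform, so the code below assumes a separable orthonormal 2-D DCT as a stand-in. The block size, the test signal, and the helper names (`dct_matrix`, `forward_2d`, `inverse_2d`, `keep_largest`) are all hypothetical and are not taken from the proposed codec. It shows only that when coefficients are sent largest first, a receiver can stop at any point and still reconstruct, with error shrinking as more coefficients arrive.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (stand-in transform, an assumption)."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    mat[0, :] /= np.sqrt(2.0)
    return mat

def forward_2d(block):
    """Separable 2-D transform: DCT along both axes of the block."""
    rows, cols = block.shape
    return dct_matrix(rows) @ block @ dct_matrix(cols).T

def inverse_2d(coeffs):
    """Inverse of forward_2d (the DCT matrices are orthonormal)."""
    rows, cols = coeffs.shape
    return dct_matrix(rows).T @ coeffs @ dct_matrix(cols)

def keep_largest(coeffs, k):
    """Zero all but the k largest-magnitude coefficients."""
    flat = coeffs.ravel()
    order = np.argsort(np.abs(flat))[::-1]
    kept = np.zeros_like(flat)
    kept[order[:k]] = flat[order[:k]]
    return kept.reshape(coeffs.shape)

# Hypothetical test block: a slowly modulated tone, arranged as 16 frames x 64 samples.
t = np.arange(16 * 64)
signal = np.sin(2 * np.pi * t / 32.0) * (1 + 0.5 * np.sin(2 * np.pi * t / 512.0))
block = signal.reshape(16, 64)

coeffs = forward_2d(block)

# Simulate a receiver that decodes after 8, 32, 128, or all 1024 coefficients.
budgets = (8, 32, 128, 1024)
errors = [
    float(np.linalg.norm(block - inverse_2d(keep_largest(coeffs, k))))
    for k in budgets
]
for k, err in zip(budgets, errors):
    print(f"{k:5d} coefficients kept, reconstruction error {err:.4f}")
```

Because the stand-in transform is orthonormal, the reconstruction error equals the energy of the discarded coefficients, so truncating a magnitude-ordered stream at any budget needs no re-encoding. A real codec would also quantize and entropy-code the coefficients and weight them perceptually, which this sketch omits.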
