Modulation frequency and efficient audio coding

The concept of modulation frequency is shown to offer valuable insight into time-frequency transforms for audio coding. We propose a two-dimensional transform whose second dimension approximately decomposes the audio signal into modulation frequencies. Applied to audio coding, this transform provides high quality at low data rates and adapts gracefully to changes in available bandwidth. It is inherently scalable, meaning that it can be matched to channel conditions without additional computation. Moreover, it is compact: in subjective tests our algorithm, coded at 32 kilobits/second/channel, outperformed MPEG-1 Layer 3 (MP3) coded at 56 kilobits/second/channel (both at a 44.1 kHz sampling rate). This potentially useful result motivates further insight into the definition and analysis of modulation frequency. We therefore define modulation frequency for a simple narrowband signal, propose a general bilinear framework for its detection, and then propose a minimal set of conditions for extending this definition to broadband signals such as audio.
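To make the idea of a second, modulation-frequency dimension concrete, the following is a minimal sketch and not the transform proposed in the paper. It assumes a plain short-time Fourier magnitude as the first transform and an FFT along the time axis of each frequency bin as the second. The function name `modulation_spectrum` and the parameters `frame_len` and `hop` are hypothetical, chosen only for illustration.

```python
import numpy as np

def modulation_spectrum(x, fs, frame_len=256, hop=64):
    """Illustrative two-dimensional transform (an assumption, not the paper's method).

    Returns (acoustic_freqs, modulation_freqs, M), where M[k, j] is the
    magnitude at modulation frequency k and acoustic frequency j.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # First transform: magnitude spectrum of each frame (time x acoustic frequency).
    envelopes = np.abs(np.fft.rfft(frames, axis=1))
    # Remove the per-bin mean so the zero-modulation term does not dominate.
    envelopes = envelopes - envelopes.mean(axis=0, keepdims=True)
    # Second transform: along time within each acoustic-frequency bin.
    M = np.abs(np.fft.rfft(envelopes, axis=0))
    acoustic_freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    # The frame sequence is sampled every hop / fs seconds.
    modulation_freqs = np.fft.rfftfreq(n_frames, d=hop / fs)
    return acoustic_freqs, modulation_freqs, M

if __name__ == "__main__":
    fs = 8000
    t = np.arange(2 * fs) / fs
    # Narrowband test signal: a 1000 Hz carrier amplitude-modulated at 8 Hz.
    x = (1.0 + 0.5 * np.cos(2 * np.pi * 8 * t)) * np.cos(2 * np.pi * 1000 * t)
    fa, fm, M = modulation_spectrum(x, fs)
    k, j = np.unravel_index(np.argmax(M), M.shape)
    print("peak acoustic frequency (Hz):", fa[j])
    print("peak modulation frequency (Hz):", fm[k])
```

For the amplitude-modulated tone in the usage example, the energy of `M` should concentrate near the carrier in the acoustic dimension and near the modulation rate in the second dimension. This is the kind of compaction the abstract alludes to, although the actual coder may use a different first-stage transform and a different second-stage decomposition.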
