Coding Overcomplete Representations of Audio Using the MCLT

We propose a system for audio coding using the modulated complex lapped transform (MCLT). In general, it is difficult to encode signals using overcomplete representations without incurring a penalty in rate-distortion performance. We show that this penalty can be significantly reduced for MCLT-based representations, without the need for iterative methods of sparsity reduction. We achieve this via magnitude-phase polar quantization combined with magnitude and phase prediction. Compared to systems based on quantization of orthogonal representations such as the modulated lapped transform (MLT), the new system allows for reduced warbling artifacts and more precise computation of frequency-domain auditory masking functions.
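To make the idea of magnitude-phase polar quantization concrete, the following is a minimal sketch of a generic uniform polar quantizer applied to one complex coefficient. It is not the quantizer used in the proposed system: the function names, the uniform magnitude step `mag_step`, and the fixed number of phase levels `phase_levels` are all assumptions chosen for illustration, and the prediction stage is omitted.

```python
import cmath
import math


def polar_quantize(coeff, mag_step=0.5, phase_levels=8):
    """Quantize a complex coefficient to (magnitude index, phase index).

    Hypothetical illustration: uniform magnitude step and a fixed number
    of uniformly spaced phase levels.
    """
    # Round the magnitude to the nearest multiple of mag_step.
    mag_idx = int(math.floor(abs(coeff) / mag_step + 0.5))
    # Round the phase to the nearest of phase_levels angles on the circle.
    phase_step = 2.0 * math.pi / phase_levels
    phase_idx = int(math.floor(cmath.phase(coeff) / phase_step + 0.5)) % phase_levels
    return mag_idx, phase_idx


def polar_dequantize(mag_idx, phase_idx, mag_step=0.5, phase_levels=8):
    """Reconstruct a complex value from the two quantization indices."""
    phase_step = 2.0 * math.pi / phase_levels
    return cmath.rect(mag_idx * mag_step, phase_idx * phase_step)


if __name__ == "__main__":
    x = 1.0 + 1.0j  # stand-in for a single MCLT coefficient
    indices = polar_quantize(x)
    x_hat = polar_dequantize(*indices)
    print(indices, x_hat, abs(x - x_hat))
```

In this sketch the magnitude and phase indices are produced independently, so each could in principle be predicted and entropy-coded on its own; how the actual system couples the two is not specified in the abstract.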
