Low noise reversible MDCT (RMDCT) and its application in progressive-to-lossless embedded audio coding

A reversible transform maps an integer input to an integer output while allowing the exact input to be reconstructed from the output. It is a key component of lossless and progressive-to-lossless audio codecs. In this work, we investigate the desired characteristics of a high-performance reversible transform. Specifically, we show that the smaller the quantization noise of the reversible modified discrete cosine transform (RMDCT), the better the compression performance of the lossless and progressive-to-lossless codec that uses it. Armed with this knowledge, we develop three RMDCT solutions. The first turns every rotation module of a float MDCT (FMDCT) into a reversible rotation, and uses multiple factorizations to further reduce the quantization noise. The second and third use matrix lifting to implement a reversible fast Fourier transform (FFT) and a reversible fractional-shifted FFT, respectively, which are then combined with reversible rotations to form the RMDCT. With matrix lifting, we can design an RMDCT that has less quantization noise yet can still be computed efficiently. A progressive-to-lossless embedded audio codec (PLEAC) employing the RMDCT is implemented, with superior results for both lossless and lossy audio compression.
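To make the notion of a reversible rotation concrete, the following is a minimal sketch of the standard three-step lifting factorization of a plane rotation, with each lifting step rounded to an integer. It is an illustration under stated assumptions, not the paper's implementation: the function names, the round-half-up rule, and the choice of this single factorization are assumptions made here, and the multiple-factorization selection that the abstract credits with reducing quantization noise is not shown.

```python
import math


def _round(v):
    """Round to the nearest integer (half up); any deterministic rule works."""
    return math.floor(v + 0.5)


def rotate_forward(x, y, theta):
    """Integer-to-integer approximation of a rotation by theta.

    Uses R(theta) = [[1, p], [0, 1]] @ [[1, 0], [s, 1]] @ [[1, p], [0, 1]]
    with p = (cos(theta) - 1) / sin(theta) and s = sin(theta).
    Assumes sin(theta) != 0.
    """
    s = math.sin(theta)
    p = (math.cos(theta) - 1.0) / s
    x += _round(p * y)  # first lifting step
    y += _round(s * x)  # second lifting step
    x += _round(p * y)  # third lifting step
    return x, y


def rotate_inverse(x, y, theta):
    """Exact inverse: undo the lifting steps in reverse order."""
    s = math.sin(theta)
    p = (math.cos(theta) - 1.0) / s
    x -= _round(p * y)
    y -= _round(s * x)
    x -= _round(p * y)
    return x, y
```

Because each step adds a rounded function of the coordinate it leaves unchanged, subtracting the same rounded value recovers the input exactly. The integer output differs from the floating-point rotation only by accumulated rounding error, which is the quantization noise the abstract refers to.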
