Perceptually Motivated Sub-band Decomposition for FDLP Audio Coding

This paper describes the use of a non-uniform QMF decomposition to increase the efficiency of a generic wide-band audio coding system based on Frequency Domain Linear Prediction (FDLP). The baseline FDLP codec, operating at high bit-rates (~136 kbps), uses a uniform QMF decomposition into 64 sub-bands followed by sub-band processing based on FDLP. Here, we propose a non-uniform QMF decomposition into 32 frequency sub-bands obtained by merging the 64 uniform QMF bands. The merging is performed so that the bandwidths of the resulting critically sampled sub-bands emulate the characteristics of the critical-band filters of the human auditory system. When employed in the FDLP audio codec, this frequency decomposition reduces the bit-rate by 40% relative to the baseline. We also describe the complete audio codec, which provides high-fidelity audio compression at ~66 kbps. In subjective listening tests, the FDLP codec outperforms MPEG-1 Layer 3 (MP3) and achieves quality similar to that of the MPEG-4 HE-AAC codec.
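
The abstract does not state the merging rule, so the following is only an illustrative sketch of how 64 uniform bands might be grouped into 32 contiguous bands whose widths grow with frequency. Everything specific in it is an assumption rather than the paper's method: the 48 kHz sampling rate, the Zwicker-style Bark approximation, the greedy "merge the perceptually narrowest adjacent pair" rule, and the helper names `bark` and `merge_uniform_bands`.

```python
import math


def bark(f_hz):
    """Zwicker-style approximation of the Bark (critical-band) scale."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def merge_uniform_bands(n_uniform=64, n_merged=32, fs_hz=48000.0):
    """Group uniform band indices into n_merged contiguous groups.

    Hypothetical rule: repeatedly merge the adjacent pair of groups whose
    combined span is narrowest on the Bark scale.
    """
    band_hz = (fs_hz / 2.0) / n_uniform  # width of one uniform band in Hz
    groups = [[k] for k in range(n_uniform)]

    def bark_width(first, last):
        # Bark-scale span from the lower edge of `first` to the upper edge of `last`.
        return bark((last + 1) * band_hz) - bark(first * band_hz)

    while len(groups) > n_merged:
        # Index of the adjacent pair with the smallest merged Bark width.
        i = min(
            range(len(groups) - 1),
            key=lambda j: bark_width(groups[j][0], groups[j + 1][-1]),
        )
        groups[i:i + 2] = [groups[i] + groups[i + 1]]
    return groups


groups = merge_uniform_bands()
widths = [len(g) for g in groups]  # number of uniform bands in each merged band
```

Under these assumptions the lowest bands stay unmerged while the highest ones are combined into wider bands. If each merged band is to remain critically sampled, a band built from `k` of the 64 uniform bands would be decimated by a factor of `64 / k`.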
