Paper information - Scalable Wide-band Audio Codec based on Frequency Domain Linear Prediction

This paper proposes a technique for wide-band audio applications based on the predictability of the temporal evolution of Quadrature Mirror Filter (QMF) sub-band signals. An input audio signal is first decomposed into 64 frequency sub-band signals using QMF decomposition. The temporal envelopes in critically sampled QMF sub-bands are approximated using frequency domain linear prediction applied over relatively long time segments (e.g., 1000 ms). Line Spectral Frequency parameters related to autoregressive models are computed and quantized in each frequency sub-band. The sub-band residual signals are quantized in the frequency domain using a split Vector Quantization (VQ) technique. In the decoder, the sub-band signal is reconstructed using the quantized residual and the corresponding quantized envelope. Finally, application of inverse QMF reconstructs the audio signal. Even with simple quantization techniques and without any psychoacoustic model, the proposed audio coder provides encouraging results on objective quality tests.
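The envelope-estimation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common formulation of frequency domain linear prediction, in which linear prediction is applied to the DCT of a sub-band segment and the resulting autoregressive "spectrum" is read as a temporal envelope. The function names (`dct_ii`, `lpc_autocorr`, `fdlp_envelope`), the model order, and the plain matrix DCT are illustrative assumptions, not the authors' implementation; QMF analysis, LSF conversion, and quantization are omitted.

```python
import numpy as np


def dct_ii(x):
    """Unnormalized DCT-II via an explicit cosine matrix (adequate for short segments)."""
    n = len(x)
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    return np.cos(np.pi * k * (2 * t + 1) / (2 * n)) @ np.asarray(x, dtype=float)


def lpc_autocorr(seq, order):
    """Autocorrelation-method linear prediction.

    Returns (a, gain), where a[0] == 1 and gain is the prediction error energy.
    """
    r = np.array([np.dot(seq[: len(seq) - lag], seq[lag:]) for lag in range(order + 1)])
    # Toeplitz normal equations R c = -r[1:], lightly regularized for stability.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + 1e-9 * r[0] * np.eye(order)
    c = np.linalg.solve(R, -r[1:])
    a = np.concatenate(([1.0], c))
    gain = r[0] + np.dot(c, r[1:])
    return a, gain


def fdlp_envelope(x, order=20):
    """Approximate the squared temporal envelope of x by an all-pole model (assumed form).

    Linear prediction runs over DCT coefficients, so the model's frequency
    axis maps to time: angle pi * (t + 0.5) / N corresponds to sample t.
    """
    n = len(x)
    a, gain = lpc_autocorr(dct_ii(x), order)
    theta = np.pi * (np.arange(n) + 0.5) / n
    A = np.exp(-1j * np.outer(theta, np.arange(order + 1))) @ a
    return gain / np.abs(A) ** 2


if __name__ == "__main__":
    # Hypothetical test signal: a single click in an otherwise silent segment.
    x = np.zeros(256)
    x[120] = 1.0
    env = fdlp_envelope(x, order=4)
    print("envelope peak at sample", int(np.argmax(env)))
```

In a codec along the lines described, the coefficients `a` would be converted to Line Spectral Frequencies and quantized per sub-band, while the sub-band signal with the envelope removed forms the residual; both steps are outside this sketch.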
