Paper information - Perceptual Audio Coding with Adaptive Non-uniform Time/frequency Tilings Using Subband Merging and Time Domain Aliasing Reduction

In this paper, we investigate the coding efficiency of perceptual coding using an adaptive non-uniform orthogonal filterbank based on MDCT analysis/synthesis and time domain aliasing reduction. We compare its performance to a system using a traditional adaptive uniform MDCT filterbank with window switching. The comparison is performed using a listening test at two different quantization settings. The statistical evaluation shows that the perceptual quality of the non-uniform filterbank significantly outperforms that of the uniform filterbank by 5 to 10 MUSHRA points.

Bernd Edler | Nils Werner

[1] Jörn Ostermann, et al. Combination of Different Perceptual Models with Different Audio Transform Coding Schemes: Implementation and Evaluation, 2010.

[2] P. Noll, et al. A new orthonormal wavelet packet decomposition for audio coding using frequency-varying modulated lapped transforms, 1995, Proceedings of 1995 Workshop on Applications of Signal Processing to Audio and Acoustics.

[3] Bernd Edler. Codierung von Audiosignalen mit überlappender Transformation und adaptiven Fensterfunktionen, 1989.

[4] Bernd Edler, et al. Nonuniform Orthogonal Filterbanks Based on MDCT Analysis/Synthesis and Time-Domain Aliasing Reduction, 2017, IEEE Signal Processing Letters.

[5] Sascha Dick, et al. Efficient Multichannel Audio Transform Coding with Low Delay and Complexity, 2016.

[6] Kenneth Rose, et al. Trellis-Based Approaches to Rate-Distortion Optimized Audio Encoding, 2010, IEEE Transactions on Audio, Speech, and Language Processing.

[7] Louis Dunn Fielder, et al. ISO/IEC MPEG-2 Advanced Audio Coding, 1997.

[8] Thibaud Necciari, et al. A quasi-orthogonal, invertible, and perceptually relevant time-frequency transform for audio coding, 2015, 2015 23rd European Signal Processing Conference (EUSIPCO).

[9] B. Moore, et al. Suggested formulae for calculating auditory-filter bandwidths and excitation patterns, 1983, The Journal of the Acoustical Society of America.

[10] J. D. Johnston, et al. Estimation of perceptual entropy using noise masking criteria, 1988, ICASSP-88, International Conference on Acoustics, Speech, and Signal Processing.

[11] Armin Taghipour, et al. A psychoacoustic model with Partial Spectral Flatness Measure for tonality estimation, 2014, 2014 22nd European Signal Processing Conference (EUSIPCO).

[12] R. Heusdens, et al. Flexible frequency decompositions for cosine-modulated filter banks, 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings (ICASSP '03).

[13] Paul A. Viola, et al. Online decoding of Markov models under latency constraints, 2006, ICML.

[14] Touradj Ebrahimi, et al. The MPEG-4 Book, 2002.

[15] John Princen, et al. Subband/Transform coding using filter bank designs based on time domain aliasing cancellation, 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.