Harmonic model for MDCT based audio coding with LPC envelope

Conventional music coders based on the modified discrete cosine transform (MDCT) degrade severely when their bit rate and delay are lowered. In particular, tonal music signals are penalized by short analysis windows, and the variable-length coding of the quantized MDCT coefficients demands a significant number of bits to code the harmonic structure. To address this, the paper proposes a frequency-domain harmonic model that amends the probability model used in the variable-length coding of the quantized MDCT coefficients. The new model was successfully combined with envelope-based arithmetic coding at rates below 10 kbps, and with context-based arithmetic coding at higher bit rates, in the recent 3GPP EVS (Enhanced Voice Services) codec standard. Objective and subjective quality tests indicate that the proposed harmonic model enhances the quality of music for low-delay audio coding.
