Low-complexity semi-parametric joint-stereo audio transform coding

Traditional audio codecs based on real-valued transforms use separate and largely independent algorithmic schemes for the parametric coding of noise-like or high-frequency spectral components and for the parametric coding of channel pairs. It is shown that in the frequency-domain part of coders such as Extended HE-AAC, these schemes can be unified into a single algorithmic block located at the core of the modified discrete cosine transform (MDCT) path. This unification allows greater flexibility, such as semi-parametric coding, and yields large savings in codec delay and complexity. This paper focuses on the stereo coding aspect of this block and demonstrates that, by using specially chosen spectral configurations when deriving the parametric side-information in the encoder, perceptual artifacts can be reduced and the spatial processing in the decoder can remain real-valued. Listening tests confirm the benefit of our proposal at intermediate bit-rates.
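To make the idea of real-valued parametric stereo processing in a transform domain more concrete, the following is a minimal sketch and not the method proposed in this paper. It assumes that each channel of one frame is already available as a real-valued coefficient vector (for example MDCT output), that the encoder transmits a mid downmix plus one real prediction gain per band, and that the decoder rebuilds the side signal from the mid signal alone. The function names `encode_band_gains` and `decode_stereo`, the band layout, and the least-squares gain are illustrative assumptions.

```python
import numpy as np

def encode_band_gains(left, right, band_edges):
    """Return the mid downmix and one real-valued side/mid gain per band.

    Assumption: left and right are real-valued spectra of one frame,
    and band_edges lists the coefficient index where each band starts,
    followed by the end index of the last band.
    """
    mid = 0.5 * (left + right)
    side = 0.5 * (left - right)
    gains = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        m = mid[lo:hi]
        s = side[lo:hi]
        energy = float(np.dot(m, m))
        # Least-squares gain g minimising ||s - g * m||^2 in this band.
        gains.append(float(np.dot(s, m)) / energy if energy > 0.0 else 0.0)
    return mid, np.array(gains)

def decode_stereo(mid, gains, band_edges):
    """Rebuild left/right from the mid spectrum and per-band gains.

    All operations are real-valued; no complex transform is needed.
    """
    side = np.zeros_like(mid)
    for g, lo, hi in zip(gains, band_edges[:-1], band_edges[1:]):
        side[lo:hi] = g * mid[lo:hi]
    return mid + side, mid - side
```

For a source that is simply amplitude-panned between the channels, the side signal is a scaled copy of the mid signal, so this sketch reconstructs both channels exactly. For less correlated content the residual that the gain cannot predict would have to be coded or synthesized by other means, which is outside the scope of this sketch.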
