Adaptive Spectral Masking of AVQ Coding and Sparseness Detection for ITU-T G.711.1 Annex D and G.722 Annex B Standards

This paper proposes a new adaptive spectral masking method for algebraic vector quantization (AVQ) of non-sparse signals in the modified discrete cosine transform (MDCT) domain. It also proposes switching the adaptive spectral masking on and off depending on whether the target signal is non-sparse. The switching decision is based on the results of an MDCT-domain sparseness analysis. When the target signal is categorized as non-sparse, the masking level of the target MDCT coefficients is adaptively controlled using spectral envelope information. The performance of the proposed method, as a part of ITU-T G.711.1 Annex D, is evaluated in comparison with conventional AVQ. Subjective listening test results show that the proposed method improves sound quality by more than 0.1 points on average on a five-point scale for speech, music, and mixed content, which indicates a significant improvement.

key words: speech and audio coding, standardization, ITU-T G.711.1 Annex D, ITU-T G.722 Annex B, super-wideband (SWB) extension, algebraic vector quantization (AVQ)
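
The switching logic described in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm standardized in G.711.1 Annex D or G.722 Annex B: the sparseness measure (share of energy in the largest coefficients), the per-band RMS envelope, the rule that zeroes coefficients below the masking level, and all names and thresholds (`sparseness`, `envelope`, `mask_if_non_sparse`, `top_k`, `sparse_threshold`, `mask_ratio`) are assumptions chosen only to show the control flow.

```python
def sparseness(coeffs, top_k=4):
    """Share of total energy held by the top_k largest coefficients.

    Hypothetical measure: values near 1.0 mean a sparse (peaky) spectrum.
    """
    energies = sorted((c * c for c in coeffs), reverse=True)
    total = sum(energies)
    if total == 0.0:
        return 1.0  # treat silence as sparse so that no masking is applied
    return sum(energies[:top_k]) / total


def envelope(coeffs, band_size=8):
    """Per-band RMS, standing in for the spectral envelope information."""
    env = []
    for start in range(0, len(coeffs), band_size):
        band = coeffs[start:start + band_size]
        env.append((sum(c * c for c in band) / len(band)) ** 0.5)
    return env


def mask_if_non_sparse(coeffs, band_size=8, sparse_threshold=0.6, mask_ratio=0.5):
    """Return (coefficients passed on to AVQ, whether masking was applied).

    Masking is switched off for sparse frames. For non-sparse frames, the
    masking level of each band follows its envelope value, and coefficients
    below that level are zeroed (assumed masking rule).
    """
    if sparseness(coeffs) >= sparse_threshold:
        return list(coeffs), False
    env = envelope(coeffs, band_size)
    masked = []
    for i, c in enumerate(coeffs):
        level = mask_ratio * env[i // band_size]
        masked.append(c if abs(c) >= level else 0.0)
    return masked, True
```

In this sketch a frame dominated by a single peak is passed through unchanged, while a frame with energy spread across many coefficients has its low-level coefficients removed before quantization.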
