A novel frequency domain BWE with relaxed synchronization and associated BWE switching

This paper presents a novel frequency-domain bandwidth extension (BWE) scheme with relaxed synchronization, optimized for coding inactive and music/mixed content signals. The algorithm achieves high subjective quality at low and medium bitrates with low algorithmic delay, and it is part of the 3GPP Enhanced Voice Services (EVS) codec. In addition to the presented algorithm, the EVS codec also employs a time-domain BWE scheme optimized for coding active speech. Consequently, seamless switching between the two BWE technologies is required; it is also described in this paper.
