ITU-T EV-VBR: A robust 8-32 kbit/s scalable coder for error prone telecommunications channels

This paper presents the ITU-T Embedded Variable Bit-Rate (EV-VBR) codec, which is being standardized by Question 9 of Study Group 16 (Q9/16) as Recommendation G.718. The codec provides a scalable solution for compressing speech and audio signals sampled at 16 kHz at rates between 8 kbit/s and 32 kbit/s, and it is robust to significant rates of frame erasures or packet losses. It comprises five layers, where higher-layer bitstreams can be discarded without affecting the decoding of the lower layers. The core layer takes advantage of signal-classification-based CELP encoding. The second layer reduces the coding error from the first layer by means of an additional pitch contribution and another algebraic codebook. The higher layers encode the weighted error signal from the lower layers using MDCT transform coding. Several technologies are used to encode the MDCT coefficients for best performance on both speech and music. The codec performance is demonstrated with selected results from the ITU-T characterization test.
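
The embedded property described above, where higher layers can be dropped without breaking lower-layer decoding, can be illustrated with a minimal sketch. It is not the codec's actual bitstream format. The abstract gives only the number of layers (five) and the end rates (8 and 32 kbit/s); the intermediate rates of 12, 16, and 24 kbit/s, the 20 ms frame length, and the function names are assumptions made for illustration.

```python
from typing import List

# Assumed cumulative rate after each layer, in kbit/s. Only the 8 and 32
# endpoints and the count of five layers come from the abstract.
LAYER_RATES_KBPS: List[int] = [8, 12, 16, 24, 32]
FRAME_MS = 20  # assumed frame duration in milliseconds


def cumulative_layer_bytes() -> List[int]:
    """Bytes per frame needed to hold layers 1..k, for k = 1..5."""
    # kbit/s times ms gives bits; divide by 8 for bytes.
    return [rate * FRAME_MS // 8 for rate in LAYER_RATES_KBPS]


def truncate_frame(frame: bytes, layers_kept: int) -> bytes:
    """Drop the higher layers by cutting the frame at a layer boundary."""
    sizes = cumulative_layer_bytes()
    if not 1 <= layers_kept <= len(sizes):
        raise ValueError("layers_kept must be between 1 and 5")
    if len(frame) < sizes[layers_kept - 1]:
        raise ValueError("frame is shorter than the requested layers")
    # Lower layers sit at the front, so a prefix is still decodable.
    return frame[: sizes[layers_kept - 1]]


def decodable_layers(payload_len: int) -> int:
    """Number of complete layers contained in a payload of this length."""
    return sum(1 for size in cumulative_layer_bytes() if payload_len >= size)
```

Under these assumptions, a network node that needs to reduce the rate would call `truncate_frame(frame, 2)` to forward only the first two layers, and a receiver could use `decodable_layers(len(payload))` to decide how many layers to decode.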
