A Multipulse-Based Forward Error Correction Technique for Robust CELP-Coded Speech Transmission Over Erasure Channels

The widely used code-excited linear prediction (CELP) paradigm relies on a strong interframe dependency, which renders CELP-based codecs vulnerable to packet loss. The main source of this dependency is the long-term prediction (LTP) filter, or adaptive codebook (ACB), which builds the current excitation from the excitation of previous frames. After a frame erasure, the previous excitation is unavailable and the encoder and decoder become desynchronized, causing additional distortion that propagates to subsequent frames. In this paper, we propose a novel media-specific forward error correction (FEC) technique that restores LTP synchronization with no additional delay, at the cost of a very small bit-rate overhead. In particular, the proposed FEC code contains a multipulse signal that replaces the excitation of the previous frame (i.e., the ACB memory) when it has been lost. This multipulse description of the previous excitation is optimized to minimize the perceptual error between the synthesized speech signal and the original one. To this end, we develop a multipulse formulation that includes the additional CELP processing and can also cope with advanced LTP filters and the usual subframe segmentation applied in modern codecs. Finally, a quantization scheme is proposed to encode the pulse parameters. Objective and subjective quality tests show that the error propagation due to the LTP filter can be practically removed with a very small increase in bandwidth.
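To make the idea of a multipulse description concrete, the following is a minimal sketch of the classic greedy multipulse search: pulses are chosen one at a time so that, after filtering by an impulse response, they approximate a target signal in the least-squares sense. It is not the formulation developed in the paper, which additionally accounts for the LTP filter, the CELP processing chain and subframe segmentation. The function name `greedy_multipulse`, the impulse response `h` and the toy target are all hypothetical and chosen only for illustration.

```python
def greedy_multipulse(target, h, num_pulses):
    """Greedy multipulse search (illustrative sketch, not the paper's method).

    target     : signal to approximate (e.g. a perceptually weighted target)
    h          : causal impulse response of the synthesis filter (assumed given)
    num_pulses : maximum number of pulses to place
    Returns (pulses, residual), where pulses is a list of (position, amplitude).
    """
    n = len(target)
    residual = list(target)
    pulses = []
    for _ in range(num_pulses):
        best = None  # (error reduction, position, amplitude)
        for pos in range(n):
            span = min(len(h), n - pos)  # truncate the response at the frame end
            corr = sum(residual[pos + k] * h[k] for k in range(span))
            energy = sum(h[k] * h[k] for k in range(span))
            if energy == 0.0:
                continue
            reduction = corr * corr / energy  # drop in squared error for this pulse
            if best is None or reduction > best[0]:
                best = (reduction, pos, corr / energy)  # optimal amplitude
        if best is None or best[0] == 0.0:
            break  # nothing left to explain
        _, pos, amp = best
        pulses.append((pos, amp))
        # Subtract the contribution of the chosen pulse from the residual.
        for k in range(min(len(h), n - pos)):
            residual[pos + k] -= amp * h[k]
    return pulses, residual


# Toy usage: a target built from two known pulses is recovered by the search.
h = [1.0, 0.5, 0.25]
target = [0.0] * 12
for pos, amp in [(2, 1.0), (7, -0.5)]:
    for k, hk in enumerate(h):
        target[pos + k] += amp * hk

pulses, residual = greedy_multipulse(target, h, num_pulses=2)
print(pulses)
```

In this toy case the two pulses do not overlap after filtering, so the greedy search recovers them directly; with overlapping responses a joint amplitude re-optimization step is commonly added, which this sketch omits.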
