A variable-bit-rate speech coding algorithm based on enhanced mixed excitation linear prediction

To improve channel bandwidth utilization in voice communication, this paper proposes a variable-bit-rate speech coding algorithm based on enhanced mixed excitation linear prediction (MELPe). In voice communication, only about 40% of the time is occupied by talking; the rest consists of silence or background noise. In addition, in low-bit-rate speech coding algorithms an unvoiced frame usually requires a lower transmission rate than a voiced one. Coding every frame at the same bit rate therefore wastes channel resources. In the proposed algorithm, voice activity detection (VAD) divides the input signal into speech and silence, and the speech frames are further classified as voiced or unvoiced. Each of the three frame types is coded and transmitted at a different rate. For a voiced frame, all parameters are encoded, transmitted and decoded. For an unvoiced frame, only the gain parameters, the LSF parameters, the pitch parameters and the overall voicing are encoded, transmitted and decoded. For a silence frame, only the gain parameters and the first-level LSF parameters are encoded, transmitted and decoded. When about 40% of the time is occupied by talking, the average coding rate of the proposed variable-bit-rate vocoder can reach 1.33 kbps, compared with 2.4 kbps for the traditional MELPe vocoder, while achieving the same synthetic speech quality. Experimental results show that the proposed method reduces the average coding rate and that the synthetic background noise is subjectively comfortable to listen to.
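The frame classification and the resulting average rate can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `FrameClass`, `classify`, `PARAMETERS` and `average_rate_kbps` are hypothetical, the VAD and voicing decisions are assumed to be supplied by an external detector, and the per-class rates and frame fractions in the usage example are placeholders, since the abstract gives only the 2.4 kbps baseline and the 1.33 kbps average.

```python
from enum import Enum


class FrameClass(Enum):
    VOICED = "voiced"
    UNVOICED = "unvoiced"
    SILENCE = "silence"


def classify(is_speech: bool, is_voiced: bool) -> FrameClass:
    """Map the VAD decision and the voicing decision to a frame class.

    Both flags are assumed to come from an external detector.
    """
    if not is_speech:
        return FrameClass.SILENCE
    return FrameClass.VOICED if is_voiced else FrameClass.UNVOICED


# Parameter groups transmitted per frame class, as listed in the abstract.
PARAMETERS = {
    FrameClass.VOICED: ["all MELPe parameters"],
    FrameClass.UNVOICED: ["gain", "LSF", "pitch", "overall voicing"],
    FrameClass.SILENCE: ["gain", "first-level LSF"],
}


def average_rate_kbps(fractions: dict, rates_kbps: dict) -> float:
    """Average rate as the fraction-weighted sum of the per-class rates."""
    total = sum(fractions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("frame-class fractions must sum to 1")
    return sum(fractions[c] * rates_kbps[c] for c in fractions)


# Placeholder values for illustration only; they are NOT taken from the paper.
example_fractions = {
    FrameClass.VOICED: 0.3,
    FrameClass.UNVOICED: 0.1,
    FrameClass.SILENCE: 0.6,
}
example_rates_kbps = {
    FrameClass.VOICED: 2.4,
    FrameClass.UNVOICED: 1.2,
    FrameClass.SILENCE: 0.4,
}
example_average = average_rate_kbps(example_fractions, example_rates_kbps)
```

With these placeholder values, 40% of the frames are speech and the weighted sum falls well below the fixed 2.4 kbps rate, which is the effect the abstract describes. Reproducing the reported 1.33 kbps would require the actual per-class bit allocations and frame statistics from the paper.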
