Joint source-channel decoding of speech spectrum parameters over an AWGN channel using Gaussian mixture models

We show how the Gaussian mixture modelling framework used to develop efficient source encoding schemes can be further exploited to model source statistics during channel decoding, yielding an effective iterative joint source-channel decoding scheme. The joint probability density function (PDF) of successive source frames is modelled as a Gaussian mixture model (GMM). Following previous work, the marginal source statistics provided by the GMM are used at the encoder to design a low-complexity memoryless source encoding scheme. This scheme has the specific advantage of providing good estimates of the probability of occurrence of a given source code-point based on the GMM. The proposed iterative decoding procedure works with any channel code whose decoder can implement the soft-output Viterbi algorithm that uses a priori information (APRI-SOVA) to provide extrinsic information on each source-encoded bit. The source decoder uses the GMM and the channel decoder output to provide a priori information back to the channel decoder. Decoding proceeds iteratively, with the source and channel decoders exchanging extrinsic information. Experimental results showing improved decoding performance are provided for the compression and communication of speech spectrum parameters.
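The source-decoder half of such an iteration can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a generic softbit-style update in which a table of code-point probabilities (here the hypothetical `prior`, which in the paper's setting would be estimated from the GMM) is combined with the channel decoder's per-bit extrinsic log-likelihood ratios. The function names and the sign convention L = log P(b=0)/P(b=1) are assumptions made for illustration.

```python
import math


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    if not values:
        return float("-inf")
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def source_extrinsic_llrs(prior, channel_llrs):
    """Per-bit extrinsic LLRs produced by a source decoder (illustrative).

    prior        : dict mapping each code-point, given as a tuple of bits,
                   to its probability (hypothetical stand-in for the
                   GMM-based code-point probability estimates).
    channel_llrs : list of channel-decoder extrinsic LLRs, one per bit,
                   with the assumed convention L = log P(b=0) / P(b=1).

    For bit k, the output excludes the channel information about bit k
    itself, so only the source statistics and the other bits contribute.
    """
    n = len(channel_llrs)
    out = []
    for k in range(n):
        terms = {0: [], 1: []}
        for bits, p in prior.items():
            if p <= 0.0:
                continue  # code-points with zero prior mass contribute nothing
            w = math.log(p)
            for j, b in enumerate(bits):
                if j != k:
                    # log P(b_j) up to a constant shared by every code-point
                    w += (1 - 2 * b) * channel_llrs[j] / 2.0
            terms[bits[k]].append(w)
        out.append(_logsumexp(terms[0]) - _logsumexp(terms[1]))
    return out
```

With a uniform prior the source decoder has nothing to add and every output is zero. When the prior couples the bits, for example by placing all its mass on the code-points `(0, 0)` and `(1, 1)`, reliable channel information on one bit is passed on as a priori information for the other.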
