Multiple-description coding (MDC) of speech with an invertible auditory model

Network signal processing aspects dominate in speech and audio coding applications such as Internet telephony or packet radio networks. We demonstrate that our approach to speech coding in a perceptual domain provides an implicit forward error concealment mechanism that handles random channel erasures. To this end, the individual acoustic subchannels of our auditory model are grouped into different transport subchannels or packets. Because the filterbank structure of the model is strongly overlapping and redundant, speech can be reconstructed without audible degradation even if a significant percentage of channels is erased (e.g., up to 40% in a 50-channel auditory model for narrowband speech). We discuss this result from both a hearing-physiology and a frame-theoretic perspective.
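The frame-theoretic side of this claim can be illustrated with a small numerical sketch. It does not use the auditory model itself: a random Gaussian matrix stands in for the redundant analysis filterbank, and the signal dimension (20), the number of packets (5), the interleaved channel-to-packet assignment, and all variable names are assumptions made for illustration only. The sketch shows that, without quantization, a signal can be recovered by least squares from the surviving coefficients of an overcomplete expansion as long as the remaining channels still span the signal space.

```python
import numpy as np

NUM_CHANNELS = 50   # channel count quoted in the abstract
SIGNAL_DIM = 20     # hypothetical signal dimension (assumption)
NUM_PACKETS = 5     # hypothetical number of transport packets (assumption)

rng = np.random.default_rng(0)

# Stand-in for the redundant analysis filterbank: each row is one channel.
# A random Gaussian frame is an assumption, not the paper's auditory model.
frame = rng.standard_normal((NUM_CHANNELS, SIGNAL_DIM))
signal = rng.standard_normal(SIGNAL_DIM)
coeffs = frame @ signal  # one coefficient per channel

# Interleave channels across packets (channel k goes to packet k mod 5).
packet_of_channel = np.arange(NUM_CHANNELS) % NUM_PACKETS

# Erase 2 of 5 packets, i.e. 40% of the channels.
lost_packets = rng.choice(NUM_PACKETS, size=2, replace=False)
received = ~np.isin(packet_of_channel, lost_packets)

# Least-squares reconstruction from the surviving channels only.
estimate, *_ = np.linalg.lstsq(frame[received], coeffs[received], rcond=None)

rel_error = np.linalg.norm(signal - estimate) / np.linalg.norm(signal)
print(f"channels received: {int(received.sum())} of {NUM_CHANNELS}")
print(f"relative reconstruction error: {rel_error:.2e}")
```

In this toy setting the 30 surviving channels still overdetermine the 20 unknowns, so the reconstruction is exact up to rounding. A real codec additionally has to cope with quantization noise and perceptual criteria, which this sketch ignores.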
