A perceptually based embedded subband speech coder

A new scheme for robust, high-quality, embedded speech coding based on subband decomposition and perceptually optimized bit allocation and prioritization is presented. An infinite impulse response (IIR) quadrature mirror filterbank (QMF) performs the subband decomposition. A perceptual model, computed using subband spectral analysis, optimizes the coder's perceptual quality. Dynamic bit allocation and prioritization are combined with embedded quantization, resulting in little performance degradation relative to a nonembedded implementation. The coder output is scalable from high quality at higher bit rates to lower quality at lower bit rates, supporting a wide range of service and resource utilization. The lower bit-rate representation is obtained simply by truncating the higher bit-rate representation. Because source-rate adaptation is performed by truncating the encoded stream, no interaction with the coder is required, making the embedded coder ideally suited for rate-adaptive communication systems. Performance for both speech and music was verified through subjective listening tests.
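The embedded property described above (a lower-rate representation obtained purely by truncation) can be illustrated with a minimal sketch. It assumes a plain uniform scalar quantizer on samples in [-1, 1) as a stand-in; it is not the coder's actual quantizer, and it omits the filterbank, perceptual model, and bit prioritization. The function names `encode`, `truncate`, and `decode` are hypothetical.

```python
def encode(sample, bits):
    """Uniformly quantize a sample in [-1, 1) to an unsigned `bits`-bit code."""
    levels = 1 << bits
    index = int((sample + 1.0) / 2.0 * levels)
    # Clamp so out-of-range inputs still map to a valid code.
    return min(max(index, 0), levels - 1)


def truncate(code, bits, keep):
    """Keep only the `keep` most significant bits of a `bits`-bit code.

    This needs no access to the encoder or the original sample.
    """
    return code >> (bits - keep)


def decode(code, bits):
    """Reconstruct at the centre of the quantization cell."""
    levels = 1 << bits
    return (code + 0.5) / levels * 2.0 - 1.0


if __name__ == "__main__":
    x = 0.3141
    full = encode(x, 8)            # higher-rate code
    coarse = truncate(full, 8, 4)  # lower-rate code, by truncation only
    print(full, decode(full, 8))
    print(coarse, decode(coarse, 4))
```

For this simple quantizer, the truncated code is identical to what a 4-bit encoder would have produced directly, so a network node can reduce the rate without re-encoding. In a practical coder the ordering of bits within the stream determines which information survives truncation.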
