Low-Delay Speech Coding at 16 kb/s and Below

The development of network-quality speech coders at 16 kb/s and below is an active research area. This thesis studies low-delay Code Excited Linear Predictive (CELP) and tree coders. Two coders are examined: a 16 kb/s stochastic tree coder based on the (M,L) search algorithm suggested by Iyengar and Kabal, and a low-delay CELP coder proposed by AT&T (a CCITT 16 kb/s standardization candidate). The first goal is to study and compare the performance of the two coders. The second is to analyze the particular characteristics that distinguish the two coders from one another. The final goal is to improve the performance of the coders, particularly with a view to bringing the bit rate below 16 kb/s. When compared under similar conditions, the two coders showed comparable performance at 16 kb/s. The analysis of the components and particular characteristics of the tree and CELP coders provides new insight for future coders. Higher-performance coder components, such as prediction, gain adaptation, and residual signal quantization, are needed. Issues in backward adaptive linear prediction analysis for both near-sample and far-sample redundancy removal are studied, including analysis methods, windowing, ill-conditioning, quantization noise effects, and computational complexity. Several new backward adaptive high-order methods show much better prediction gains than previously reported ones. Besides a better high-order predictor for both coders, the suggested improvements include a new scheme for training the excitation dictionary and a better gain adaptation strategy for the tree coder. A hybrid "Tree-CELP" coder, taking the best components from the two archetypes, is a good candidate for pushing coding rates below 16 kb/s.
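As an illustration of what backward adaptive linear prediction analysis involves, the following is a minimal sketch, not the method developed in the thesis. It assumes the autocorrelation method with a Hamming window, the Levinson-Durbin recursion, and a small white-noise correction on the zero-lag term as one common guard against ill-conditioning. The function names, the window choice, and the correction factor are all hypothetical choices made for this example.

```python
import math

def autocorrelation(x, order):
    """Autocorrelation lags r[0..order] of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i - k] for i in range(k, n)) for k in range(order + 1)]

def levinson_durbin(r, order):
    """Solve the normal equations for a predictor x_hat[n] = sum_k a[k] * x[n-k].

    Returns (a[1..order], residual energy).
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input; stop raising the order
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        k_m = acc / err  # reflection coefficient of stage m
        new_a = a[:]
        new_a[m] = k_m
        for k in range(1, m):
            new_a[k] = a[k] - k_m * a[m - k]
        a = new_a
        err *= 1.0 - k_m * k_m
    return a[1:], err

def backward_adaptive_coeffs(past_reconstructed, order, noise_factor=1.0 + 1e-4):
    """Derive predictor coefficients from past *reconstructed* samples only.

    Because the decoder holds the same past output, it can repeat this
    analysis and no predictor side information has to be transmitted.
    The Hamming window and noise_factor are assumptions of this sketch.
    """
    n = len(past_reconstructed)
    window = [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]
    xw = [w * s for w, s in zip(window, past_reconstructed)]
    r = autocorrelation(xw, order)
    r[0] *= noise_factor  # white-noise correction against ill-conditioning
    return levinson_durbin(r, order)
```

In a backward adaptive coder, a call such as `backward_adaptive_coeffs(history, order)` would be repeated at each update interval on the most recent decoded samples, at both the encoder and the decoder. Higher orders raise the cost of the analysis, which is one reason computational complexity appears among the issues listed above.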
