The Use of Pitch Prediction in Speech Coding

Two major types of correlation are present in a speech signal: near-sample redundancies and distant-sample redundancies. Near-sample redundancies are those present among speech samples that are close together. Distant-sample redundancies are due to the inherent periodicity of voiced speech. Predictive speech coders exploit these correlations to enhance coding efficiency. In such coders, a cascade of two nonrecursive prediction error filters processes the original speech signal: the formant filter removes near-sample redundancies, and the pitch filter acts on distant-sample waveform similarities. The result is a residual signal with little sample-to-sample correlation. The parameters that are quantized and coded for transmission include the filter coefficients and the residual signal. From the coded parameters, the receiver decodes the speech by passing the quantized residual through a pitch synthesis filter and then a formant synthesis filter. In the frequency domain, the filtering steps at the receiver can be viewed as first inserting the fine pitch structure and then shaping the spectral envelope to insert the formant structure. The formant and pitch filters are adaptive in that the analysis to determine their coefficients is carried out frame by frame. The bits representing the quantized parameters are likewise transmitted on a frame-by-frame basis. The bit rate of the coder is the total number of bits transmitted in one frame divided by the time duration of the analysis frame.
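The analysis and synthesis chain described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the single-tap pitch predictor, the coefficient values, the lag, and the frame figures are all assumptions chosen for illustration, and quantization and frame-by-frame adaptation are omitted. The sketch only shows that the two nonrecursive prediction error filters at the transmitter are undone by the two recursive synthesis filters at the receiver, and how the bit rate follows from the bits per frame and the frame duration.

```python
import math

def formant_analysis(x, a):
    """Nonrecursive formant prediction error filter:
    e[n] = x[n] - sum_k a[k] * x[n-1-k] (samples before n = 0 taken as zero)."""
    return [x[n] - sum(ak * x[n - 1 - k] for k, ak in enumerate(a) if n - 1 - k >= 0)
            for n in range(len(x))]

def pitch_analysis(e, beta, lag):
    """Nonrecursive one-tap pitch prediction error filter (assumed form):
    r[n] = e[n] - beta * e[n-lag]."""
    return [e[n] - (beta * e[n - lag] if n >= lag else 0.0) for n in range(len(e))]

def pitch_synthesis(r, beta, lag):
    """Recursive inverse of pitch_analysis: e[n] = r[n] + beta * e[n-lag]."""
    e = []
    for n, rn in enumerate(r):
        e.append(rn + (beta * e[n - lag] if n >= lag else 0.0))
    return e

def formant_synthesis(e, a):
    """Recursive inverse of formant_analysis: x[n] = e[n] + sum_k a[k] * x[n-1-k]."""
    x = []
    for n, en in enumerate(e):
        x.append(en + sum(ak * x[n - 1 - k] for k, ak in enumerate(a) if n - 1 - k >= 0))
    return x

def bit_rate(bits_per_frame, frame_duration_s):
    """Bit rate in bits/s: bits sent per frame divided by the frame duration."""
    return bits_per_frame / frame_duration_s

# Hypothetical test signal and parameters (not from the paper).
signal = [math.sin(2 * math.pi * n / 40) + 0.3 * math.sin(2 * math.pi * n / 7)
          for n in range(200)]
a = [0.9]             # assumed first-order formant predictor coefficient
beta, lag = 0.5, 40   # assumed pitch predictor gain and lag in samples

# Transmitter: formant filter first, then pitch filter.
residual = pitch_analysis(formant_analysis(signal, a), beta, lag)
# Receiver: pitch synthesis first, then formant synthesis.
decoded = formant_synthesis(pitch_synthesis(residual, beta, lag), a)

max_error = max(abs(u - v) for u, v in zip(signal, decoded))
print("max reconstruction error:", max_error)
print("bit rate (bits/s):", bit_rate(160, 0.020))
```

Because the residual is left unquantized here, the decoded signal matches the input up to floating-point rounding. In an actual coder the residual and the filter coefficients would be quantized, so the reconstruction would only approximate the input. The bit-rate call uses an assumed 160 bits per 20 ms frame, which corresponds to 8000 bits/s.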
