Role of multi-pulse excitation in synthesis of natural-sounding voiced speech

In this paper we discuss the role of multi-pulse excitation in the synthesis of natural-sounding voiced speech. In an earlier study, Atal and David reported on the relative importance of the spectral and phase properties of LPC excitation for the subjective quality of synthetic speech. This paper extends their results to multi-pulse excitation. The multi-pulse analysis procedure determines the excitation by a closed-loop matching procedure that minimizes the perceptual difference between the original and synthetic speech signals. However, even for periodic voiced speech, the secondary pulses in the multi-pulse excitation do not vary systematically from one pitch period to another. We find that the multi-pulse excitation is highly periodic in the lower frequency bands and much less periodic in the higher bands. We suggest that this irregularity contributes to the "fullness" heard in multi-pulse synthesized speech. We also investigated the effect on speech quality of replacing the original multi-pulse patterns in the excitation with a fixed multi-pulse pattern selected randomly from the multi-pulse excitation of a voiced speech segment. Our results suggest that fixed multi-pulse patterns introduce only small degradations in synthetic speech.
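The closed-loop matching idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes a plain squared-error criterion instead of the perceptually weighted one, a short hypothetical impulse response `h` standing in for the LPC synthesis filter, and a greedy one-pulse-at-a-time search. The names `synthesize` and `greedy_multipulse` are illustrative only.

```python
def synthesize(pulses, h, n_samples):
    """Sum scaled, shifted copies of h, one per (position, amplitude) pulse."""
    out = [0.0] * n_samples
    for pos, amp in pulses:
        for k, hk in enumerate(h):
            if pos + k < n_samples:
                out[pos + k] += amp * hk
    return out


def greedy_multipulse(target, h, n_pulses):
    """Place pulses one at a time, each minimizing the remaining squared error.

    Assumption: unweighted squared error; a real coder would apply a
    perceptual weighting filter to both target and impulse response first.
    """
    residual = list(target)
    n = len(residual)
    pulses = []
    for _ in range(n_pulses):
        best = None  # (error_reduction, position, amplitude)
        for pos in range(n):
            span = min(len(h), n - pos)  # truncate h at the frame end
            corr = sum(residual[pos + k] * h[k] for k in range(span))
            energy = sum(h[k] * h[k] for k in range(span))
            if energy == 0.0:
                continue
            # With the optimal amplitude corr/energy, the squared error
            # drops by corr^2/energy; keep the position with the largest drop.
            reduction = corr * corr / energy
            if best is None or reduction > best[0]:
                best = (reduction, pos, corr / energy)
        if best is None:
            break
        _, pos, amp = best
        pulses.append((pos, amp))
        # Remove this pulse's contribution before searching for the next one.
        for k in range(min(len(h), n - pos)):
            residual[pos + k] -= amp * h[k]
    return pulses, residual


if __name__ == "__main__":
    h = [1.0, 0.5, 0.25]  # hypothetical impulse response
    target = synthesize([(3, 2.0), (12, -1.0)], h, 20)
    found, err = greedy_multipulse(target, h, 2)
    print(found, sum(e * e for e in err))
```

Because each pulse is chosen only to reduce the remaining error, nothing in such a search forces the smaller pulses to repeat from one pitch period to the next, which is consistent with the irregularity of the secondary pulses described above.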

[1] Bishnu S. Atal, et al. Predictive Coding of Speech at Low Bit Rates, 1982, IEEE Trans. Commun.

[2] K. Ozawa, et al. High quality multi-pulse speech coder with pitch prediction, 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[3] Bishnu S. Atal, et al. Periodic repetition of multi-pulse excitation, 1983.

[4] Bishnu S. Atal, et al. Improving performance of multi-pulse LPC coders at low bit rates, 1984, ICASSP.

[5] K. Ozawa, et al. Low bit rate multi-pulse speech coder with natural speech quality, 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[6] J. Makhoul, et al. A mixed-source model for speech compression and synthesis, 1978.

[7] Bishnu S. Atal, et al. On synthesizing natural-sounding speech by linear prediction, 1979, ICASSP.

[8] Bishnu S. Atal, et al. A new model of LPC excitation for producing natural-sounding speech at low bit rates, 1982, ICASSP.