Peak-to-RMS reduction of speech based on a sinusoidal model

A sinusoidal analysis/synthesis system is used to apply a radar design solution to the problem of dispersing the phase of a speech waveform. Unlike conventional methods of phase dispersion, this technique adapts dynamically to the pitch and spectral characteristics of the speech while maintaining the original spectral envelope. The solution can also be used to drive the sine-wave amplitude modification for amplitude compression, and is coupled to the desired shaping of the speech spectrum. When integrated with amplitude compression, the proposed dispersion solution yields a significant reduction in the peak-to-RMS (root-mean-square) ratio of the speech waveform with an acceptable loss in quality. Application of a real-time prototype sine-wave preprocessor to AM radio broadcasting is described.
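To make the peak-to-RMS quantity concrete, the following minimal sketch compares one period of a harmonic sine-wave sum under two phase choices. It is an illustration only and not the adaptive method summarized above: it assumes unit amplitudes (a flat spectrum) and uses a fixed, textbook quadratic phase rule for multitone signals, and the names `harmonic_signal`, `peak_to_rms`, and `quadratic_phases` are hypothetical helpers introduced here.

```python
import math


def harmonic_signal(phases, num_samples=1024):
    """One period of a unit-amplitude harmonic sum.

    x[n] = sum_k cos(2*pi*k*n/num_samples + phases[k-1]), k = 1..len(phases)
    """
    return [
        sum(
            math.cos(2.0 * math.pi * k * n / num_samples + phi)
            for k, phi in enumerate(phases, start=1)
        )
        for n in range(num_samples)
    ]


def peak_to_rms(x):
    """Ratio of the largest absolute sample to the root-mean-square value."""
    peak = max(abs(v) for v in x)
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return peak / rms


def quadratic_phases(num_harmonics):
    """Assumed fixed rule for a flat spectrum: phi_k = -pi*k*(k-1)/N."""
    return [
        -math.pi * k * (k - 1) / num_harmonics
        for k in range(1, num_harmonics + 1)
    ]


NUM_HARMONICS = 16

# All harmonics in phase: every cosine peaks together at n = 0.
zero_phase = harmonic_signal([0.0] * NUM_HARMONICS)

# Same amplitudes, dispersed phases: the energy is spread over the period.
dispersed = harmonic_signal(quadratic_phases(NUM_HARMONICS))

crest_zero = peak_to_rms(zero_phase)
crest_dispersed = peak_to_rms(dispersed)

print(f"zero-phase peak-to-RMS: {crest_zero:.3f}")
print(f"dispersed  peak-to-RMS: {crest_dispersed:.3f}")
```

Both signals have the same sine-wave amplitudes and therefore the same RMS value; only the phases differ. This is the sense in which phase dispersion lowers the peak without altering the spectral envelope. A speech-adaptive scheme would additionally have to track the pitch and the non-flat amplitudes of each frame, which this sketch does not attempt.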
