The Pitch Mode Modulation Model and Its Application in Speech Processing

The techniques currently used for speech coding or enhancement depend critically on some form of statistical stationarity in the speech signal, the noise signal, or both. Virtually all speech processing techniques use a speech model to reduce the amount of information needed to characterize the speech signal. Although the speech signal is known to be highly redundant, it is also non-stationary. This non-stationarity requires that the model parameters be extracted from short-duration signal segments, over which the stationarity assumption of the model is not seriously violated. Unfortunately, the use of short speech frames makes the model parameters difficult to estimate and sometimes obscures the very redundancy on which the model is based. A longer frame size is desirable for many signal processing techniques that require increased frequency-domain resolution.
