Two-channel speech analysis

We present a two-channel technique for improving speech analysis in certain applications. One channel is the signal from an electroglottograph (EGG), which monitors the vibratory motion of the vocal folds. The other channel is the speech signal obtained from a conventional microphone. We show how the EGG can be used as a tool for validating speech processing algorithms and estimating possible lower bounds for both computation and performance of these algorithms, particularly closed-phase speech analysis. Our system is used to classify speech segments as voiced, unvoiced, mixed voiced, and silent and to estimate the fundamental frequency of voicing. This four-way classification is not implemented as a complete algorithm; it still requires some user judgments and decisions. The technical results, however, illustrate an EGG-based algorithm for voiced/unvoiced-silent classification. In addition, we illustrate how automatic on-line inverse filtering can be achieved. The results demonstrate the superiority of the closed-phase covariance analysis method over several other commonly used methods. Source-tract coupling is shown to be a significant factor in linear prediction analysis, a factor commonly ignored to date. Various applications of our two-channel approach are described along with the major disadvantage, namely, that in some situations the EGG channel cannot be acquired.
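
As a rough illustration of the closed-phase covariance analysis mentioned above, the following sketch fits linear-prediction coefficients by least squares over a single analysis interval. It is an assumption-laden toy, not the authors' system: the function name `covariance_lpc`, the interval bounds `start` and `end`, and the synthetic two-pole test signal are all hypothetical, and in practice the closed-phase interval would have to be located first, for example from the EGG channel.

```python
import numpy as np

def covariance_lpc(signal, start, end, order):
    """Covariance-method linear prediction (illustrative sketch).

    Fits a_1..a_p so that s[n] ~ sum_k a_k * s[n-k] for n in [start, end).
    No window is applied; samples before `start` serve as predictor history.
    """
    s = np.asarray(signal, dtype=float)
    if start < order:
        raise ValueError("need at least `order` samples before `start`")
    if end - start < order:
        raise ValueError("interval too short for the requested order")
    # Each row holds the `order` previous samples, most recent first.
    rows = np.array([s[n - order:n][::-1] for n in range(start, end)])
    target = s[start:end]
    coeffs, *_ = np.linalg.lstsq(rows, target, rcond=None)
    return coeffs

# Hypothetical test signal: a freely decaying two-pole resonator, standing in
# for a vocal tract ringing with no glottal excitation (an idealized closed phase).
true_a = np.array([1.5, -0.9])
x = np.zeros(40)
x[0], x[1] = 1.0, 1.5
for n in range(2, 40):
    x[n] = true_a[0] * x[n - 1] + true_a[1] * x[n - 2]

est = covariance_lpc(x, start=2, end=30, order=2)
print(est)
```

Because the toy signal has no excitation inside the analysis interval, the least-squares fit recovers the resonator coefficients up to numerical precision. Real speech would not satisfy this assumption exactly, for instance when source-tract coupling is present.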
