Speech Signal Processing and Feature Extraction

Speech signal processing and feature extraction is the initial stage of any speech recognition system; it is through this component that the system views the speech signal itself. This chapter introduces general approaches to signal processing and feature extraction and surveys the techniques currently available in these areas.
