Automatic Speech Recognition: an Auditory Perspective

Keywords: speech

Reference: EPFL-REPORT-82481

Record created on 2006-03-10, modified on 2017-05-10

[1]  Harvey Fletcher,et al.  Speech and hearing in communication , 1953 .

[2]  Hynek Hermansky,et al.  Towards increasing speech recognition error rates , 1995, Speech Commun..

[3]  Mari Ostendorf,et al.  Context modeling with the stochastic segment model , 1992, IEEE Trans. Signal Process..

[4]  Jordan Cohen,et al.  Vocal tract normalization in speech recognition: Compensating for systematic speaker variability , 1995 .

[5]  A. Samuel,et al.  Whither speech recognition? , 1969, The Journal of the Acoustical Society of America.

[6]  H. Fletcher,et al.  Loudness, its definition, measurement and calculation. , 1933 .

[7]  Steven Greenberg,et al.  Stochastic perceptual models of speech , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[8]  Hynek Hermansky,et al.  RASTA processing of speech , 1994, IEEE Trans. Speech Audio Process..

[9]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[10]  Misha Pavel,et al.  Towards ASR on partially corrupted speech , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[11]  T. Houtgast,et al.  A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria , 1985 .

[12]  J. Lim.  Spectral root homomorphic deconvolution system , 1979, ICASSP.

[13]  Phil D. Green,et al.  Handling missing data in speech recognition , 1994, ICSLP.

[14]  M. Sanders.  Handbook of Sensory Physiology , 1975 .

[15]  Gunnar Fant,et al.  Acoustic Theory Of Speech Production , 1960 .

[16]  S. Furui,et al.  Cepstral analysis technique for automatic speaker verification , 1981 .

[17]  B. Moore.  An Introduction to the Psychology of Hearing , 1977 .

[18]  Eric Fosler-Lussier,et al.  Speech recognition using on-line estimation of speaking rate , 1997, EUROSPEECH.

[19]  Hervé Bourlard,et al.  A new ASR approach based on independent processing and recombination of partial frequency bands , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[20]  Roger K. Moore,et al.  Modelling asynchrony in speech using elementary single-signal decomposition , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[21]  Hervé Bourlard,et al.  Using multiple time scales in a multi-stream speech recognition system , 1997, EUROSPEECH.

[22]  Frederick Jelinek,et al.  Statistical methods for speech recognition , 1997 .

[23]  S. Furui.  On the role of spectral transition for speech perception. , 1986, The Journal of the Acoustical Society of America.

[24]  Charles C. Tappert,et al.  Memory and time improvements in a dynamic programming algorithm for matching speech patterns , 1978 .

[25]  Richard M. Schwartz,et al.  On-line cursive handwriting recognition using speech recognition methods , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.

[26]  James Glass,et al.  Acoustic segmentation and phonetic classification in the SUMMIT system , 1988, International Conference on Acoustics, Speech, and Signal Processing.

[27]  R. R. Riesz.  Differential Intensity Sensitivity of the Ear for Pure Tones , 1928 .

[28]  Mari Ostendorf,et al.  Context modeling with the stochastic segment model , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[29]  Ronald W. Schafer,et al.  Digital Processing of Speech Signals , 1978 .

[30]  Louis C. W. Pols,et al.  Real-Time Recognition of Spoken Words , 1971, IEEE Transactions on Computers.

[31]  Jont B. Allen,et al.  How do humans process and recognize speech? , 1993, IEEE Trans. Speech Audio Process..

[32]  R. Plomp,et al.  Effect of temporal envelope smearing on speech reception. , 1994, The Journal of the Acoustical Society of America.

[33]  W. R. Webster,et al.  Click-evoked response patterns of single units in the medial geniculate body of the cat. , 1966, Journal of neurophysiology.

[34]  C. H. Chen,et al.  Pattern recognition and artificial intelligence , 1976 .

[35]  Oded Ghitza,et al.  Hidden Markov models with templates as non-stationary states: an application to speech recognition , 1993, Comput. Speech Lang..

[36]  Hynek Hermansky,et al.  Modulation Spectrum in Speech Processing , 1998 .

[37]  Misha Pavel,et al.  Intelligibility of speech with filtered time trajectories of spectral envelopes , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[38]  Richard Lippmann,et al.  Speech recognition by machines and humans , 1997, Speech Commun..

[39]  P. Mermelstein,et al.  Distance measures for speech recognition, psychological and instrumental , 1976 .

[40]  Stan Davis,et al.  Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences , 1980 .

[41]  H. Hermansky,et al.  Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.

[42]  L. A. Chistovich.  Central auditory processing of peripheral vowel spectra. , 1985, The Journal of the Acoustical Society of America.

[43]  J. R. Cohen,et al.  Application of an auditory model to speech recognition. , 1989, The Journal of the Acoustical Society of America.

[44]  S. S. Stevens.  On the psychophysical law. , 1957, Psychological review.

[45]  R. Plomp,et al.  Effect of reducing slow temporal modulations on speech reception. , 1994, The Journal of the Acoustical Society of America.

[46]  Phil D. Green,et al.  Missing data techniques for robust speech recognition , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[47]  Shuichi Itahashi,et al.  Automatic formant extraction utilizing mel scale and equal loudness contour , 1976, ICASSP.

[48]  K. Davis,et al.  Automatic Recognition of Spoken Digits , 1952 .

[49]  A. Prochazka,et al.  Signal Analysis and Prediction , 1998 .

[50]  Xiaodong Sun,et al.  Speech recognition using hidden Markov models with polynomial regression functions as nonstationary states , 1994, IEEE Trans. Speech Audio Process..

[51]  Steven Greenberg,et al.  Performance improvements through combining phone- and syllable-scale information in automatic speech recognition , 1998, ICSLP.

[52]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[53]  Gregory J. Wolff,et al.  Lipreading by Neural Networks: Visual Preprocessing, Learning, and Sensory Integration , 1993, NIPS.

[54]  J. C. Stevens,et al.  Brightness and loudness as functions of stimulus duration , 1966 .

[55]  Nikki Mirghafori,et al.  Transmissions and transitions: a study of two common assumptions in multi-band ASR , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98.

[56]  Peter F. Brown,et al.  The acoustic-modeling problem in automatic speech recognition , 1987 .

[57]  Richard M. Stern,et al.  Acoustical Pre-processing for Robust Speech Recognition , 1989, HLT.

[58]  Paul Mermelstein,et al.  Experiments in syllable-based recognition of continuous speech , 1980, ICASSP.

[59]  Douglas D. O'Shaughnessy,et al.  Speech communication : human and machine , 1987 .