Anthropomorphic feature extraction algorithm for speech recognition in adverse environments

Speech recognition engines must remain reasonably accurate in adverse environments if they are to make their way from the laboratory into applications. The human auditory system, however, has proven to be a versatile tool that is capable of outperforming known artificial algorithms in their target environments. Recent advances in psychoacoustics and auditory physiology point to the essentially nonlinear behaviour of the auditory apparatus. By interpreting this biological information processing, it is possible to construct a parametric “human-like” nonlinear algorithm that exhibits properties similar to those of the living system. In this paper we describe the anthropomorphic feature extraction algorithm, test its performance against the formulated requirements for efficient and robust feature extraction, and provide a comparative benchmark of a compact ASR system combined with the proposed algorithm in adverse conditions.
