An Auditory Inspired Amplitude Modulation Filter Bank for Robust Feature Extraction in Automatic Speech Recognition
[1] Jan Cernocký,et al. Probabilistic and Bottle-Neck Features for LVCSR of Meetings , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[2] Sarel van Vuuren,et al. Data-driven design of RASTA-like filters , 1997, EUROSPEECH.
[3] Frédéric E. Theunissen,et al. The Modulation Transfer Function for Speech Intelligibility , 2009, PLoS Comput. Biol..
[4] Hermann Ney,et al. Deep hierarchical bottleneck MRASTA features for LVCSR , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[5] Steve Renals,et al. WSJCAM0: a British English speech corpus for large vocabulary continuous speech recognition , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[6] Richard M. Stern,et al. Feature extraction for robust speech recognition based on maximizing the sharpness of the power distribution and on power flooring , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[7] Sarel van Vuuren,et al. Data based filter design for RASTA-like channel normalization in ASR , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[8] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[9] Tim Jürgens,et al. Noise robust distant automatic speech recognition utilizing NMF based source separation and auditory feature extraction , 2013 .
[10] Jeff A. Bilmes,et al. MVA Processing of Speech Features , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[11] Haizhou Li,et al. Normalization of the Speech Modulation Spectra for Robust Speech Recognition , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[12] R. Plomp,et al. Effect of temporal envelope smearing on speech reception. , 1994, The Journal of the Acoustical Society of America.
[13] Hynek Hermansky,et al. Multi-resolution RASTA filtering for TANDEM-based ASR , 2005, INTERSPEECH.
[14] Keith Vertanen. Baseline WSJ Acoustic Models for HTK and Sphinx: Training Recipes and Recognition Experiments , 2007 .
[15] B. Kollmeier,et al. Modeling auditory processing of amplitude modulation. I. Detection and masking with narrow-band carriers. , 1997, The Journal of the Acoustical Society of America.
[16] Birger Kollmeier,et al. Estimation of the signal-to-noise ratio with amplitude modulation spectrograms , 2002, Speech Commun..
[17] Tomohiro Nakatani,et al. The reverb challenge: A common evaluation framework for dereverberation and recognition of reverberant speech , 2013, 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
[18] Daniel P. W. Ellis,et al. Frequency-domain linear prediction for temporal features , 2003, 2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721).
[19] R. Plomp,et al. Effect of reducing slow temporal modulations on speech reception. , 1994, The Journal of the Acoustical Society of America.
[20] Birger Kollmeier,et al. Amplitude modulation spectrogram based features for robust speech recognition in noisy and reverberant environments , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] J. Foote,et al. WSJCAM0: a British English speech corpus for large vocabulary continuous speech recognition , 1995 .
[22] G. Rose,et al. Sensitivity to amplitude modulated sounds in the anuran auditory nervous system. , 1985, Journal of neurophysiology.
[23] Hynek Hermansky,et al. RASTA processing of speech , 1994, IEEE Trans. Speech Audio Process..
[24] H Hermansky,et al. Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.
[25] T. Yin,et al. Responses to amplitude-modulated tones in the auditory nerve of the cat. , 1992, The Journal of the Acoustical Society of America.
[26] D. Grantham,et al. Modulation masking: effects of modulation frequency, depth, and phase. , 1989, The Journal of the Acoustical Society of America.
[27] Jeih-Weih Hung,et al. Optimization of temporal filters for constructing robust features in speech recognition , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[28] Michael Kleinschmidt,et al. Localized spectro-temporal features for automatic speech recognition , 2003, INTERSPEECH.
[29] Yoshua Bengio,et al. Greedy Layer-Wise Training of Deep Networks , 2007 .
[30] B. Kollmeier,et al. Spectro-temporal modulation subspace-spanning filter bank features for robust automatic speech recognition. , 2012, The Journal of the Acoustical Society of America.
[31] Hynek Hermansky,et al. Temporal patterns (TRAPs) in ASR of noisy speech , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).
[32] Birger Kollmeier,et al. On the use of spectro-temporal features for the IEEE AASP challenge ‘detection and classification of acoustic scenes and events’ , 2013, 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
[33] C. Schreiner,et al. Gabor analysis of auditory midbrain receptive fields: spectro-temporal and binaural composition. , 2003, Journal of neurophysiology.
[34] Hermann Ney,et al. Context-Dependent MLPs for LVCSR: TANDEM, Hybrid or Both? , 2012, INTERSPEECH.
[35] S. Furui,et al. Speaker-independent isolated word recognition based on emphasized spectral dynamics , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[36] C E Schreiner,et al. Neural processing of amplitude-modulated sounds. , 2004, Physiological reviews.
[37] Daniel P. W. Ellis,et al. Tandem connectionist feature extraction for conventional HMM systems , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[38] T. Houtgast. Frequency selectivity in amplitude-modulation detection. , 1989, The Journal of the Acoustical Society of America.
[39] B. Kollmeier,et al. Speech enhancement based on physiological and psychoacoustical models of modulation perception and binaural interaction. , 1994, The Journal of the Acoustical Society of America.
[40] C. Schreiner,et al. Periodicity coding in the inferior colliculus of the cat. I. Neuronal mechanisms. , 1988, Journal of neurophysiology.
[41] N. Viemeister. Temporal modulation transfer functions based upon modulation thresholds. , 1979, The Journal of the Acoustical Society of America.
[42] Hynek Hermansky,et al. Temporal envelope compensation for robust phoneme recognition using modulation spectrum. , 2010, The Journal of the Acoustical Society of America.
[43] E. Evans. Place and time coding of frequency in the peripheral auditory system: some physiological pros and cons. , 1978, Audiology : official organ of the International Society of Audiology.
[44] David Pearce,et al. The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions , 2000, INTERSPEECH.
[45] Jean-Marc Boite,et al. Nonlinear discriminant analysis for improved speech recognition , 1997, EUROSPEECH.