ASR feature extraction with morphologically-filtered power-normalized cochleograms
Francisco J. Valverde-Albacete | Carmen Peláez-Moreno | Ascensión Gallardo-Antolín | Fernando de-la-Calle-Silos
[1] W. Jesteadt, et al. Forward masking as a function of frequency, masker level, and signal delay, 1982, The Journal of the Acoustical Society of America.
[2] Birger Kollmeier, et al. Robustness of spectro-temporal features against intrinsic and extrinsic variations in automatic speech recognition, 2011, Speech Commun.
[3] Brian R. Glasberg, et al. Derivation of auditory filter shapes from notched-noise data, 1990, Hearing Research.
[4] Georg v. Békésy, et al. On the Resonance Curve and the Decay Period at Various Points on the Cochlear Partition, 1949.
[5] Marc René Schädler, et al. Comparing Different Flavors of Spectro-Temporal Features for ASR, 2011, INTERSPEECH.
[6] G. Matheron, et al. The Birth of Mathematical Morphology, 2002.
[7] B. Moore, et al. A revised model of loudness perception applied to cochlear hearing loss, 2004, Hearing Research.
[8] K.K. Paliwal, et al. Auditory masking based acoustic front-end for robust speech recognition, 1997, TENCON '97 Brisbane - Australia. Proceedings of IEEE TENCON '97. IEEE Region 10 Annual Conference. Speech and Image Technologies for Computing and Telecommunications (Cat. No.97CH36162).
[9] Francisco J. Valverde-Albacete, et al. Auditory-Inspired Morphological Processing of Speech Spectrograms: Applications in Automatic Speech Recognition and Speech Enhancement, 2013, Cognitive Computation.
[10] R. Patterson, et al. Complex Sounds and Auditory Images, 1992.
[11] Richard M. Schwartz, et al. Enhancement of speech corrupted by acoustic noise, 1979, ICASSP.
[12] E. B. Newman, et al. A Scale for the Measurement of the Psychological Magnitude Pitch, 1937.
[13] Edward R. Dougherty, et al. Hands-on Morphological Image Processing, 2003.
[14] Carmen Peláez-Moreno, et al. Morphological Processing of Spectrograms for Speech Enhancement, 2011, NOLISP.
[15] Stan Davis, et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se, 1980.
[16] David Pearce, et al. The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions, 2000, INTERSPEECH.
[17] Diego H. Milone, et al. Bioinspired sparse spectro-temporal representation of speech for robust classification, 2012, Comput. Speech Lang.
[18] Richard M. Stern, et al. Power-Normalized Cepstral Coefficients (PNCC) for Robust Speech Recognition, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[19] E. Zwicker, et al. Inverse frequency dependence of simultaneous tone-on-tone masking patterns at low levels, 1982, The Journal of the Acoustical Society of America.
[20] Richard M. Stern, et al. Hearing Is Believing: Biologically Inspired Methods for Robust Automatic Speech Recognition, 2012, IEEE Signal Processing Magazine.
[21] Ephraim. Speech enhancement using a minimum mean square error short-time spectral amplitude estimator, 1984.
[22] John H. L. Hansen, et al. Morphological constrained feature enhancement with adaptive cepstral compensation (MCE-ACC) for speech recognition in noise and Lombard effect, 1994, IEEE Trans. Speech Audio Process.
[23] Volker Hohmann, et al. Acoustic features for speech recognition based on Gammatone filterbank and instantaneous frequency, 2011, Speech Commun.
[24] Martin Heckmann, et al. A hierarchical framework for spectro-temporal feature extraction, 2011, Speech Commun.
[25] Serajul Haque. Utilizing auditory masking in automatic speech recognition, 2010, 2010 International Conference on Audio, Language and Image Processing.
[26] Tuomas Virtanen, et al. Modelling spectro-temporal dynamics in factorisation-based noise-robust automatic speech recognition, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] L. Carney, et al. A phenomenological model for the responses of auditory-nerve fibers: I. Nonlinear tuning with compression and suppression, 2001, The Journal of the Acoustical Society of America.
[28] Yi Hu, et al. Incorporating a psychoacoustical model in frequency domain speech enhancement, 2004, IEEE Signal Processing Letters.
[29] Birger Kollmeier, et al. Hooking up spectro-temporal filters with auditory-inspired representations for robust automatic speech recognition, 2012, INTERSPEECH.
[30] Pascal Scalart, et al. Speech enhancement based on a priori signal to noise estimation, 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[31] Richard M. Stern, et al. Physiologically-motivated synchrony-based processing for robust automatic speech recognition, 2006, INTERSPEECH.