Listening in the Dips: Comparing Relevant Features for Speech Recognition in Humans and Machines