Human and automatic speech recognition in the presence of speech-intrinsic variations
[1] James Emil Flege,et al. Interaction between the native and second language phonetic subsystems , 2003, Speech Commun..
[2] L D Shriberg,et al. A procedure for phonetic transcription by consensus. , 1984, Journal of speech and hearing research.
[3] B. Kollmeier,et al. Effect of speech-intrinsic variations on human and automatic recognition of spoken phonemes. , 2011, The Journal of the Acoustical Society of America.
[4] Richard M. Stern,et al. On the effects of speech rate in large vocabulary speech recognition systems , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[5] Christian Kaernbach. A behavioral reverse correlation technique to decipher early auditory feature coding , 1999 .
[6] T Dau,et al. A quantitative model of the "effective" signal processing in the auditory system. I. Model structure. , 1996, The Journal of the Acoustical Society of America.
[7] Kathryn Woodcock,et al. Ergonomics and automatic speech recognition applications for deaf and hard-of-hearing users , 1997 .
[8] M. D. Wang,et al. Consonant confusions in noise: a study of perceptual features. , 1973, The Journal of the Acoustical Society of America.
[9] J. C. Steinberg,et al. Factors Governing the Intelligibility of Speech Sounds , 1945 .
[10] Jont B. Allen. How do humans process and recognize speech , 1993 .
[11] G. A. Miller,et al. An Analysis of Perceptual Confusions Among Some English Consonants , 1955 .
[12] Jean C. Krause,et al. Investigating alternative forms of clear speech: the effects of speaking rate and speaking mode on intelligibility. , 2002, The Journal of the Acoustical Society of America.
[13] Martin Heckmann,et al. A closer look on hierarchical spectro-temporal features (HIST) , 2008, INTERSPEECH.
[14] C. Schreiner,et al. Gabor analysis of auditory midbrain receptive fields: spectro-temporal and binaural composition. , 2003, Journal of neurophysiology.
[15] Richard M. Stern,et al. Analysis of physiologically-motivated signal processing for robust speech recognition , 2008, INTERSPEECH.
[16] Stephen V. David,et al. Representation of Phonemes in Primary Auditory Cortex: How the Brain Analyzes Speech , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[17] S.D. Peters,et al. On the limits of speech recognition in noise , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).
[18] Louis D. Braida,et al. Human and machine consonant recognition , 2005, Speech Commun..
[19] Frantisek Grézl,et al. Improved MLP structures for data-driven feature extraction for ASR , 2005, INTERSPEECH.
[20] Martin Cooke,et al. A glimpsing model of speech perception in noise. , 2006, The Journal of the Acoustical Society of America.
[21] S A Shamma,et al. Spectro-temporal response field characterization with dynamic ripples in ferret primary auditory cortex. , 2001, Journal of neurophysiology.
[22] Tim Jürgens,et al. Modelling the human-machine gap in speech reception: microscopic speech intelligibility prediction for normal-hearing subjects with an auditory model , 2007, INTERSPEECH.
[23] J. C. Krause,et al. Acoustic properties of naturally produced clear speech at normal speaking rates. , 1996, The Journal of the Acoustical Society of America.
[24] Frank Joublin,et al. Hierarchical spectro-temporal features for robust speech recognition , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[25] J M Festen. Contributions of comodulation masking release and temporal resolution to the speech-reception threshold masked by an interfering voice. , 1993, The Journal of the Acoustical Society of America.
[26] Richard M. Stern,et al. Signal Processing for Robust Speech Recognition , 1994, HLT.
[27] Alexander Fischer,et al. Progress with the Philips continuous ASR system on the Aurora 2 noisy digits database , 2002, INTERSPEECH.
[28] Alfred Mertins,et al. Oldenburg logatome speech corpus (OLLO) for speech recognition experiments with humans and machines , 2005, INTERSPEECH.
[29] B Kollmeier,et al. Development and evaluation of a German sentence test for objective and subjective speech intelligibility assessment. , 1997, The Journal of the Acoustical Society of America.
[30] Ernst Günter Schukat-Talamazzini. Statistische Spracherkennung , 1995, Künstliche Intell..
[31] David Pearce,et al. The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions , 2000, INTERSPEECH.
[32] Alfred Mertins,et al. Automatic speech recognition and speech variability: A review , 2007, Speech Commun..
[33] Dirk Van Compernolle,et al. Synthesizing speech from speech recognition parameters , 2004, INTERSPEECH.
[34] Florian Schiel,et al. Automatic detection and segmentation of pronunciation variants in German speech corpora , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[35] Birger Kollmeier,et al. Robustness of spectro-temporal features against intrinsic and extrinsic variations in automatic speech recognition , 2011, Speech Commun..
[36] Joseph P. Olive,et al. Two protocols comparing human and machine phonetic recognition performance in conversational speech , 2008, INTERSPEECH.
[37] H. Levitt,et al. Predicting consonant confusions from acoustic analysis. , 1981, The Journal of the Acoustical Society of America.
[38] Stan Davis,et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se , 1980 .
[39] W. Dreschler,et al. Artificial noise signals with speechlike spectral and temporal properties for hearing instrument assessment , 1999 .
[40] T. Mcarthur,et al. The Oxford companion to the English language , 1994 .
[41] K. Kohler. Einführung in die Phonetik des Deutschen , 1981 .
[42] J. Hillenbrand,et al. Acoustic characteristics of American English vowels. , 1994, The Journal of the Acoustical Society of America.
[43] C W Turner,et al. Use of temporal envelope cues in speech recognition by normal and hearing-impaired listeners. , 1995, The Journal of the Acoustical Society of America.
[44] S. Phatak,et al. Consonant and Vowel confusions , 2006 .
[45] Kate Hunicke-Smith,et al. Effect of Speaking Style on LVCSR Performance , 1996 .
[46] Albert S. Bregman,et al. The Auditory Scene. (Book Reviews: Auditory Scene Analysis. The Perceptual Organization of Sound.) , 1990 .
[47] Nelson Morgan,et al. Multi-stream spectro-temporal features for robust speech recognition , 2008, INTERSPEECH.
[48] Steve Young,et al. The HTK book , 1995 .
[49] Matthew H. Davis,et al. Leading Up the Lexical Garden Path: Segmentation and Ambiguity in Spoken Word Recognition , 2002 .
[50] E. Vajda. Handbook of the International Phonetic Association: A Guide to the Use of the International Phonetic Alphabet , 2000 .
[51] Jean C. Krause,et al. The effects of speaking rate on the intelligibility of speech for various speaking modes , 1995 .
[52] Odette Scharenborg,et al. Parallels between HSR and ASR: how ASR can contribute to HSR , 2005, INTERSPEECH.
[53] R. G. Leonard,et al. A database for speaker-independent digit recognition , 1984, ICASSP.
[54] Birger Kollmeier,et al. Optimization and evaluation of Gabor feature sets for ASR , 2008, INTERSPEECH.
[55] Eric Fosler-Lussier,et al. Effects of speaking rate and word frequency on conversational pronunciations , 1999 .
[56] B. Kollmeier,et al. A human-machine comparison in speech recognition based on a logatome corpus , 2006 .
[57] Hynek Hermansky,et al. Noise resistant auditory model for parametrization of speech , 1997 .
[58] M. Kleinschmidt. Methods for capturing spectro-temporal modulations in automatic speech recognition , 2001 .
[59] Louis ten Bosch,et al. Bridging the gap between human and automatic speech recognition , 2007, Speech Commun..
[60] Jon Barker,et al. Modelling speaker intelligibility in noise , 2007, Speech Commun..
[61] Nelson Morgan,et al. Multi-stream to many-stream: using spectro-temporal features for ASR , 2009, INTERSPEECH.
[62] Phil D. Green,et al. Robust automatic speech recognition with missing and unreliable acoustic data , 2001, Speech Commun..
[63] Jae Lim,et al. Signal estimation from modified short-time Fourier transform , 1984 .
[64] Valerie Hazan,et al. Acoustic-phonetic correlates of talker intelligibility for adults and children. , 2004, The Journal of the Acoustical Society of America.
[65] Richard Lippmann,et al. Speech recognition by machines and humans , 1997, Speech Commun..
[66] Hermann Ney,et al. Using phase spectrum information for improved speech recognition performance , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[67] Josef Kittler,et al. Floating search methods for feature selection with nonmonotonic criterion functions , 1994, Proceedings of the 12th IAPR International Conference on Pattern Recognition, Vol. 3 - Conference C: Signal Processing (Cat. No.94CH3440-5).
[68] Hynek Hermansky,et al. Temporal patterns (TRAPs) in ASR of noisy speech , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).
[69] R. Mühler,et al. Development of a Speaker Discrimination Test for Cochlear Implant Users Based on the Oldenburg Logatome Corpus , 2008, ORL.
[70] Daniel P. W. Ellis,et al. Tandem connectionist feature extraction for conventional HMM systems , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[71] Jörn Anemüller,et al. Predictability of STRFs in auditory cortex neurons depends on stimulus class , 2008, INTERSPEECH.
[72] J Tchorz,et al. A model of auditory perception as front end for automatic speech recognition. , 1999, The Journal of the Acoustical Society of America.
[73] David Gelbart,et al. Improving word accuracy with Gabor feature extraction , 2002, INTERSPEECH.
[74] Petros Maragos,et al. Robust AM-FM features for speech recognition , 2005, IEEE Signal Processing Letters.
[75] T. Brand,et al. Microscopic prediction of speech recognition for listeners with normal hearing in noise using an auditory model. , 2009, The Journal of the Acoustical Society of America.
[76] Bernd T. Meyer,et al. The non-native consonant challenge for European languages , 2008, INTERSPEECH.
[77] Chi‐nin Li. Accent, intelligibility, and comprehensibility in the perception of foreign‐accented Lombard speech , 2003 .
[78] Michael Kleinschmidt,et al. Robust speech recognition based on spectro-temporal processing , 2002 .
[79] Hynek Hermansky,et al. RASTA processing of speech , 1994, IEEE Trans. Speech Audio Process..
[80] Hans Werner Strube,et al. Recognition of isolated words based on psychoacoustics and neurobiology , 1990, Speech Commun..
[81] S. Gelfand,et al. Consonant recognition in quiet as a function of aging among normal hearing subjects. , 1985, The Journal of the Acoustical Society of America.
[82] T. Gramss. Fast algorithms to find invariant features for a word recognizing neural net , 1991 .
[83] W. Dreschler,et al. ICRA noises: artificial noise signals with speech-like spectral and temporal properties for hearing instrument assessment. International Collegium for Rehabilitative Audiology. , 2001, Audiology : official organ of the International Society of Audiology.
[84] Birger Kollmeier,et al. Complementarity of MFCC, PLP and Gabor features in the presence of speech-intrinsic variabilities , 2009, INTERSPEECH.
[85] Alfred Mertins,et al. Introduction to the Special Issue on Intrinsic Speech Variations , 2007, Speech Commun..
[86] Richard M. Stern,et al. Towards fusion of feature extraction and acoustic model training: a top down process for robust speech recognition , 2009, INTERSPEECH.
[87] Michael Kleinschmidt,et al. Localized spectro-temporal features for automatic speech recognition , 2003, INTERSPEECH.
[88] J C Junqua,et al. The Lombard reflex and its role on human listeners and automatic speech recognizers. , 1993, The Journal of the Acoustical Society of America.
[89] Melvyn J. Hunt,et al. Spectral Signal Processing for ASR , 2007 .
[90] Hynek Hermansky,et al. Should recognizers have ears? , 1998, Speech Commun..
[91] B E Walden,et al. Evaluating the articulation index for auditory-visual consonant recognition. , 1996, The Journal of the Acoustical Society of America.
[92] Birger Kollmeier,et al. Phoneme confusions in human and automatic speech recognition , 2007, INTERSPEECH.
[93] Tony Ezzat,et al. Spectro-temporal analysis of speech using 2-d Gabor filters , 2007, INTERSPEECH.
[94] Odette Scharenborg,et al. Reaching over the gap: A review of efforts to link human and automatic speech recognition research , 2007, Speech Commun..