Which words are hard to recognize? Prosodic, lexical, and disfluency factors that increase speech recognition error rates
[1] Austin F. Frank,et al. Analyzing linguistic data: a practical introduction to statistics using R , 2010 .
[2] Daniel Jurafsky,et al. Which Words Are Hard to Recognize? Prosodic, Lexical, and Disfluency Factors that Increase ASR Error Rates , 2008, ACL.
[3] Sadaoki Furui,et al. Differences between acoustic characteristics of spontaneous and read speech and their effects on speech recognition performance , 2008, Comput. Speech Lang.
[4] Andreas Stolcke,et al. Recent innovations in speech-to-text transcription at SRI-ICSI-UW , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[5] P. Boersma. Praat: doing phonetics by computer (version 4.4.24) , 2006 .
[6] Mark J. F. Gales,et al. Automatic transcription of conversational telephone speech , 2005, IEEE Transactions on Speech and Audio Processing.
[7] Alicia B. Wassink,et al. Pacific northwest vowels: A Seattle neighborhood dialect study , 2005 .
[8] Mark J. F. Gales,et al. Training LVCSR systems on thousands of hours of data , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
[9] Lori Lamel,et al. Do speech recognizers prefer female speakers? , 2005, INTERSPEECH.
[10] E. Ziegel. Permutation, Parametric, and Bootstrap Tests of Hypotheses (3rd ed.) , 2005 .
[11] Xunying Liu,et al. Development of the 2004 CU-HTK English CTS systems using more than two thousand hours of data , 2004 .
[12] Julia Hirschberg,et al. Prosodic and other cues to speech recognition failures , 2004, Speech Commun.
[13] Mark J. F. Gales,et al. Development of the 2003 CU-HTK conversational telephone speech transcription system , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[14] Rosalind Temple,et al. Phonetic Interpretation: Papers in Laboratory Phonology VI , 2004 .
[15] Andreas Stolcke,et al. Getting More Mileage from Web Text Sources for Conversational Speech Language Modeling using Class-Dependent Mixtures , 2003, NAACL.
[16] Andreas Stolcke,et al. Prosodic knowledge sources for automatic speech recognition , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
[17] Dan Jurafsky,et al. Effects of disfluencies, predictability, and utterance position on word form variation in English conversation. , 2003, The Journal of the Acoustical Society of America.
[18] Paul Boersma,et al. Praat: doing phonetics by computer , 2003 .
[19] Taehong Cho,et al. Domain-initial articulatory strengthening in four languages , 2003 .
[20] Mary P. Harper,et al. The SuperARV Language Model: Investigating the Effectiveness of Tightly Integrating Multiple Knowledge Sources , 2002, EMNLP.
[21] Daniel Povey,et al. Minimum Phone Error and I-smoothing for improved discriminative training , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[22] Paul Boersma,et al. Praat, a system for doing phonetics by computer , 2002 .
[23] Sadaoki Furui,et al. Error analysis using decision trees in spontaneous presentation speech recognition , 2001, IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.
[24] M. Tanenhaus,et al. Time Course of Frequency Effects in Spoken-Word Recognition: Evidence from Eye Movements , 2001, Cognitive Psychology.
[25] S. Goldinger,et al. Phonetic priming, neighborhood activation, and PARSYN , 2000, Perception & psychophysics.
[26] Eric Fosler-Lussier,et al. Effects of speaking rate and word frequency on pronunciations in conversational speech , 1999, Speech Commun.
[27] P. Luce,et al. Probabilistic Phonotactics and Neighborhood Activation in Spoken Word Recognition , 1999 .
[28] Sumio Ohno,et al. On the effects of speech rate upon parameters of the command-response model for the fundamental frequency contours of speech , 1998, ICSLP.
[29] D. Pisoni,et al. Recognizing Spoken Words: The Neighborhood Activation Model , 1998, Ear and hearing.
[30] P. Keating,et al. Articulatory strengthening at edges of prosodic domains. , 1997, The Journal of the Acoustical Society of America.
[31] David B. Pisoni,et al. Intelligibility of normal speech I: Global and fine-grained acoustic-phonetic talker characteristics , 1996, Speech Commun.
[32] R. P. Fahey,et al. On explaining certain male-female differences in the phonetic realization of vowel categories , 1996 .
[33] Adwait Ratnaparkhi,et al. A Maximum Entropy Model for Part-Of-Speech Tagging , 1996, EMNLP.
[34] P. Good. Permutation, Parametric, and Bootstrap Tests of Hypotheses , 2005 .
[35] Richard M. Stern,et al. On the effects of speech rate in large vocabulary speech recognition systems , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[36] E. Shriberg,et al. Acoustic properties of disfluent repetitions , 1995 .
[37] A. Syrdal,et al. Applied speech technology , 1995 .
[38] David B. Pisoni,et al. Automatic measurement of speech recognition performance: a comparison of six speaker-dependent recognition devices , 1987 .
[39] W. Marslen-Wilson. Functional parallelism in spoken word-recognition , 1987, Cognition.
[40] G. R. Doddington,et al. Computers: Speech recognition: Turning theory to practice: New ICs have brought the requisite computer power to speech technology; an evaluation of equipment shows where it stands today , 1981, IEEE Spectrum.
[41] A. E. Hieke. A Content-Processing View of Hesitation Phenomena , 1981 .
[42] D. Howes. On the interpretation of word frequency as a variable affecting speed of recognition. , 1954, Journal of experimental psychology.