Dynamic pronunciation models for automatic speech recognition
[1] C. Fowler,et al. Talkers' signaling of new and old words in speech and listeners' perception and use of the distinction , 1987 .
[2] Andreas Stolcke,et al. Multiple-pronunciation lexical modeling in a speaker independent speech understanding system , 1994, ICSLP.
[3] Corey Miller,et al. Pronunciation modeling in speech synthesis , 1998 .
[4] Katrin Kirchhoff. Combining articulatory and acoustic information for speech recognition in noisy and reverberant environments , 1998, ICSLP.
[5] David B. Pisoni,et al. Text-to-speech: the MITalk system , 1987 .
[6] D. Pisoni,et al. Perception of the duration of rapid spectrum changes in speech and nonspeech signals , 1983, Perception & psychophysics.
[7] Patti Price,et al. The DARPA 1000-word resource management database for continuous speech recognition , 1988, ICASSP-88, International Conference on Acoustics, Speech, and Signal Processing.
[8] Harriet J. Nock,et al. Pronunciation modeling by sharing Gaussian densities across phonetic models , 1999, EUROSPEECH.
[9] James H. Martin,et al. Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition, 2nd Edition , 2000, Prentice Hall series in artificial intelligence.
[10] Gitta P. M. Laan. The contribution of intonation, segmental durations, and spectral features to the perception of a spontaneous and a read speaking style , 1997, Speech Commun..
[11] P. Lieberman. Some Effects of Semantic and Grammatical Context on the Production and Perception of Speech , 1963 .
[12] Jonathan G. Fiscus,et al. A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER) , 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.
[13] Daniel Jurafsky,et al. Building multiple pronunciation models for novel words using exploratory computational phonology , 1995, EUROSPEECH.
[14] Lori Lamel,et al. On designing pronunciation lexicons for large vocabulary continuous speech recognition , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[15] M. A. Randolph. A data-driven method for discovering and predicting allophonic variation , 1990, International Conference on Acoustics, Speech, and Signal Processing.
[16] Don McAllaster,et al. Fabricating conversational speech data with acoustic models: a program to examine model-data mismatch , 1998, ICSLP.
[17] Bruce Tesar,et al. Computational optimality theory , 1996 .
[18] Lotfi A. Zadeh,et al. Phonological structures for speech recognition , 1989 .
[19] Kathleen J. Mullen,et al. Agricultural Policies in India , 2018, OECD Food and Agricultural Reviews.
[20] Robert F. Port,et al. The influence of tempo on stop closure duration as a cue for voicing and place , 1979 .
[21] G. Ayers. Discourse functions of pitch range in spontaneous and read speech , 1994 .
[22] Steven Greenberg,et al. ON THE ORIGINS OF SPEECH INTELLIGIBILITY IN THE REAL WORLD , 1997 .
[23] P. Ladefoged,et al. Phonetic linguistics : essays in honor of Peter Ladefoged , 1987 .
[24] F. Goldman-Eisler,et al. Sequential Temporal Patterns in Spontaneous Speech , 1966 .
[25] Helmer Strik,et al. Modeling pronunciation variation for a Dutch CSR: testing three methods , 1998, ICSLP.
[26] Hervé Bourlard,et al. CDNN: a context dependent neural network for continuous speech recognition , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[27] Donald J. Sharf,et al. Phonetic Analysis of Normal and Abnormal Speech , 1991 .
[28] Robert A. Jacobs,et al. Hierarchical Mixtures of Experts and the EM Algorithm , 1993, Neural Computation.
[29] Christian-Michael Westendorf,et al. Learning pronunciation dictionary from speech data , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[30] S. J. Young,et al. Tree-based state tying for high accuracy acoustic modelling , 1994 .
[31] Torbjørn Svendsen,et al. Maximum likelihood modelling of pronunciation variation , 1999, Speech Commun..
[32] Yochai Konig,et al. REMAP: Recursive Estimation and Maximization of A Posteriori Probabilities - Application to Transition-Based Connectionist Speech Recognition , 1995, NIPS.
[33] A. Liberman,et al. Some effects of later-occurring information on the perception of stop consonant and semivowel , 1979, Perception & psychophysics.
[34] Florien J. van Beinum. Spectro-temporal reduction and expansion in spontaneous speech and read text: the role of focus words , 1990, ICSLP.
[35] Kuldip K. Paliwal,et al. Automatic Speech and Speaker Recognition: Advanced Topics , 1999 .
[36] Robert I. Damper,et al. A recurrent network that learns to pronounce English text , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[37] Stephen Cox,et al. A comparison of two unsupervised approaches to accent identification , 1998, ICSLP.
[38] Detlef Koll,et al. Modeling and efficient decoding of large vocabulary conversational speech , 1999, EUROSPEECH.
[39] George Zavaliagkos,et al. Pronunciation modeling for large vocabulary conversational speech recognition , 1998, ICSLP.
[40] Steven Bird,et al. One-Level Phonology: Autosegmental Representations and Rules as Finite Automata , 1994, Comput. Linguistics.
[41] Andrej Ljolje,et al. Automatic Generation of Detailed Pronunciation Lexicons , 1996 .
[42] Martin Kay,et al. Regular Models of Phonological Rule Systems , 1994, CL.
[43] W. Labov. Principles of Linguistic Change: Internal Factors , 1994 .
[44] Jean-Pierre Martens,et al. On the use of pronunciation rules for improved word recognition , 1995, EUROSPEECH.
[45] Francine R. Chen,et al. Computational Models of American Speech , 1992 .
[46] Ellen M. Kaisse. Connected Speech: The Interaction of Syntax and Phonology , 1985 .
[47] Jean-Pierre Martens,et al. A fast and reliable rate of speech detector , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[48] Xuedong Huang,et al. Improvements on a trainable letter-to-sound converter , 1997, EUROSPEECH.
[49] J. Wolf,et al. The HWIM speech understanding system , 1977 .
[50] R. A. Sharman,et al. A bi-directional model of English pronunciation , 1991, EUROSPEECH.
[51] Richard Sproat,et al. Compilation of Weighted Finite-State Transducers from Decision Trees , 1996, ACL.
[52] J. Friedman,et al. Computer exploration of fast-speech rules , 1975 .
[53] Steven Greenberg,et al. Speaking in shorthand - A syllable-centric perspective for understanding pronunciation variation , 1999, Speech Commun..
[54] Roland Kuhn,et al. Rescoring multiple pronunciations generated from spelled words , 1998, ICSLP.
[55] Jonathan G. Fiscus,et al. DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM , 1993 .
[56] Rosaria Silipo,et al. AUTOMATIC TRANSCRIPTION OF PROSODIC STRESS FOR SPONTANEOUS ENGLISH DISCOURSE , 1999 .
[57] Anthony J. Robinson,et al. Context-Dependent Classes in a Hybrid Recurrent Network-HMM Speech Recognition System , 1995, NIPS.
[58] Mitch Weintraub,et al. Automatic Learning of Word Pronunciation from Data , 1996 .
[59] W. Nick Campbell. Syllable-level duration determination , 1989, EUROSPEECH.
[60] Fergus McInnes,et al. Use of acoustic sentence level and lexical stress in HSMM speech recognition , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[61] James R. Glass,et al. Empirical acquisition of word and phrase classes in the ATIS domain , 1993, EUROSPEECH.
[62] Alexander H. Waibel,et al. Dictionary learning for spontaneous speech recognition , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[63] Robert L. Mercer,et al. An information theoretic approach to the automatic determination of phonemic baseforms , 1984, ICASSP.
[64] Daniel Gildea,et al. Forms of English Function Words — Effects of Disfluencies , Turn Position , Age and Sex , and Predictability , 1999 .
[65] Alex Waibel,et al. Modeling Systematic Variations in Pronunciation via a Language-Dependent Hidden Speaking Mode , 1999 .
[66] Harriet J. Nock,et al. Detecting and correcting poor pronunciations for multiword units , 1998 .
[67] C. Pollard,et al. Center for the Study of Language and Information , 2022 .
[68] Fernando Pereira,et al. Transducer composition for context-dependent network expansion , 1997, EUROSPEECH.
[69] Alex Waibel,et al. Flexible transcription alignment , 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.
[70] N. Morgan,et al. INCORPORATING CONTEXTUAL PHONETICS INTO AUTOMATIC SPEECH RECOGNITION , 1999 .
[71] Noam Chomsky,et al. The Sound Pattern of English , 1968 .
[72] Michael Galler,et al. On the use of stochastic inference networks for representing multiple word pronunciations , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[73] T. Crystal,et al. Segmental durations in connected‐speech signals: Current results , 1988 .
[74] Ronald A. Cole,et al. Automatically generated word pronunciations from phoneme classifier output , 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[75] William J. Byrne,et al. Stochastic pronunciation modelling from hand-labelled phonetic corpora , 1999, Speech Commun..
[76] Steve R. Waterhouse,et al. Transcription of broadcast television and radio news: the 1996 ABBOT system , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[77] Ellen Eide. Automatic modeling of pronunciation variations , 1999, EUROSPEECH.
[78] William D. Raymond,et al. Reduction of English function words in Switchboard , 1998, ICSLP.
[79] Joseph Picone,et al. Improved surname pronunciations using decision trees , 1998, ICSLP.
[80] Nelson Morgan,et al. Perceptually inspired signal processing strategies for robust speech recognition in reverberant environments , 1998 .
[81] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[82] Satoshi Kobayashi,et al. Extraction and representation rhythmic components of spontaneous speech , 1997, EUROSPEECH.
[83] I. Good. THE POPULATION FREQUENCIES OF SPECIES AND THE ESTIMATION OF POPULATION PARAMETERS , 1953 .
[84] Michael Picheny,et al. Automatic phonetic baseform determination , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.
[85] H Hermansky,et al. Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.
[86] J L Miller,et al. How the components of speaking rate influence perception of phonetic segments. , 1981, Journal of experimental psychology. Human perception and performance.
[87] Brian Kingsbury,et al. An Overview of the SPRACH System for the Transcription of Broadcast News , 1999 .
[88] Hy Murveit,et al. Linguistic constraints in hidden Markov model based speech recognition , 1989, International Conference on Acoustics, Speech, and Signal Processing.
[89] Florian Schiel. A new approach to speaker adaptation by modelling pronunciation in automatic speech recognition , 1993, Speech Commun..
[90] C Soares,et al. The influence of inter- and intra-speaker tempo on fundamental frequency and palatalization. , 1983, The Journal of the Acoustical Society of America.
[91] V. Zue,et al. The role of phonological rules in speech understanding research , 1975 .
[92] H. Levin,et al. The Prosodic and Paralinguistic Features of Reading and Telling Stories , 1982 .
[93] Lalit R. Bahl,et al. Recognition of continuously read natural corpus , 1978, ICASSP.
[94] Eric Fosler-Lussier,et al. Towards robustness to fast speech in ASR , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[95] Alexander H. Waibel,et al. Speaking mode dependent pronunciation modeling in large vocabulary conversational speech recognition , 1997, EUROSPEECH.
[96] Frederick Jelinek,et al. Interpolated estimation of Markov source parameters from sparse data , 1980 .
[97] Richard M. Stern,et al. The 1996 Hub-4 Sphinx-3 System , 1997 .
[98] W. Ganong. Phonetic categorization in auditory word perception. , 1980, Journal of experimental psychology. Human perception and performance.
[99] Eric Fosler-Lussier,et al. Multi-level decision trees for static and dynamic pronunciation models , 1999, EUROSPEECH.
[100] J L Miller,et al. Some effects of speaking rate on the production of /b/ and /w/. , 1983, The Journal of the Acoustical Society of America.
[101] Jason J. Humphries. Accent modelling and adaptation in automatic speech recognition , 1998 .
[102] Eric Fosler-Lussier,et al. Fast speakers in large vocabulary continuous speech recognition: analysis & antidotes , 1995, EUROSPEECH.
[103] Eric Fosler-Lussier,et al. Not just what, but also when: Guided automatic pronunciation modeling for Broadcast News , 1999 .
[104] Jonathan G. Fiscus,et al. 1998 Broadcast News Benchmark Test Results: English and Non-English Word Error Rate Performance Measures , 1998 .
[105] Horacio Franco,et al. Hybrid neural network/hidden Markov model continuous-speech recognition , 1992, ICSLP.
[106] Slava M. Katz,et al. Estimation of probabilities from sparse data for the language model component of a speech recognizer , 1987, IEEE Trans. Acoust. Speech Signal Process..
[107] Joseph Picone,et al. An advanced system to generate pronunciations of proper nouns , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[108] Q. Summerfield. Articulatory rate and perceptual constancy in phonetic perception. , 1981, Journal of experimental psychology. Human perception and performance.
[109] Mitch Weintraub,et al. WS96 project report: Automatic learning of word pronunciation from data , 1997 .
[110] Victor Zue,et al. Statistical and linguistic analyses of F0 in read and spontaneous speech , 1992, ICSLP.
[111] Terrence J. Sejnowski,et al. Parallel Networks that Learn to Pronounce English Text , 1987, Complex Syst..
[112] Eric Fosler-Lussier,et al. Combining multiple estimators of speaking rate , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98.
[113] Jean-Pierre Martens,et al. In Search of Pronunciation Rules , 1998 .
[114] Daniel Jurafsky,et al. Learning Phonological Rule Probabilities from Speech Corpora with Exploratory Computational Phonology , 1995, ACL.
[115] Gethin Williams,et al. Knowing What You Don't Know: Roles for Confidence Measures in Automatic Speech Recognition , 1999 .
[116] Kenneth Ward Church. Phonological parsing in speech recognition , 1987 .
[117] Michael Riley,et al. A statistical model for generating pronunciation networks , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.
[118] Shigeki Sagayama,et al. Phoneme environment clustering for speech recognition , 1989, International Conference on Acoustics, Speech, and Signal Processing.
[119] Steve Renals,et al. DECODER TECHNOLOGY FOR CONNECTIONIST LARGE VOCABULARY SPEECH RECOGNITION , 1995 .
[120] T. Mark Ellison,et al. Phonological Derivation in Optimality Theory , 1994, COLING.
[121] Florian Schiel,et al. Statistical Modelling Of Pronunciation: It's Not The Model, It's The Data , 1998 .
[122] William M. Fisher. A statistical text-to-phone function using ngrams and rules , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99.
[123] Jason Eisner,et al. Efficient Generation in Primitive Optimality Theory , 1997 .
[124] Yoshinori Sagisaka,et al. Automatic generation of multiple pronunciations based on neural networks , 1999, Speech Commun..
[125] Bruce T. Lowerre,et al. The HARPY speech recognition system , 1976 .
[126] Lennart Nord,et al. Prediction of syllable duration, speech rate and tempo , 1992, ICSLP.
[127] Hervé Bourlard,et al. Connectionist Speech Recognition: A Hybrid Approach , 1993 .
[128] Daniel Gildea,et al. Learning Bias and Phonological-Rule Induction , 1996, CL.