Linguistically-motivated sub-word modeling with applications to speech recognition
[1] Mari Ostendorf,et al. Moving beyond the 'beads-on-a-string' model of speech , 1999 .
[2] A. Asadi,et al. Automatic detection and modeling of new words in a large-vocabulary continuous speech recognition system , 1992 .
[3] James R. Glass,et al. Learning units for domain-independent out-of-vocabulary word modelling , 2001, INTERSPEECH.
[4] Murat Saraclar,et al. Hybrid language models for out of vocabulary word detection in large vocabulary conversational speech recognition , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[5] Mona Singh,et al. Experiments in spoken queries for document retrieval , 1997, EUROSPEECH.
[6] Yu Shi,et al. A system for spoken query information retrieval on mobile devices , 2002, IEEE Trans. Speech Audio Process..
[7] Dong Yu,et al. An introduction to voice search , 2008, IEEE Signal Processing Magazine.
[8] I. Lee Hetherington,et al. An efficient implementation of phonological rules using finite-state transducers , 2001, INTERSPEECH.
[9] Victor Zue,et al. JUPITER: a telephone-based conversational interface for weather information , 2000, IEEE Trans. Speech Audio Process..
[10] Josef G. Bauer,et al. Accurate recognition of city names with spelling as a fall back strategy , 1999, EUROSPEECH.
[11] I. Lee Hetherington. A characterization of the problem of new, out-of-vocabulary words in continuous-speech recognition and understanding , 1995 .
[12] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[13] Stephanie Seneff,et al. Developing City Name Acquisition Strategies in Spoken Dialogue Systems Via User Simulation , 2005, SIGDIAL.
[14] Grace Chung. Automatically incorporating unknown words in JUPITER , 2000, INTERSPEECH.
[15] Günther Ruske,et al. Lexical out-of-vocabulary models for one-stage speech interpretation , 2005, INTERSPEECH.
[16] Robert I. Damper,et al. A multistrategy approach to improving pronunciation by analogy , 2000, CL.
[17] James Glass,et al. A Multimodal Home Entertainment Interface via a Mobile Device , 2008, ACL 2008.
[18] T. J. Watson. IMPROVEMENTS IN ENGLISH ASR FOR THE MALACH PROJECT USING SYLLABLE-CENTRIC MODELS , 2003 .
[19] John Makhoul,et al. BYBLOS: The BBN continuous speech recognition system , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[20] Yan Han,et al. Trajectory Clustering of Syllable-Length Acoustic Models for Continuous Speech Recognition , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[21] Richard M. Schwartz,et al. Automatic Detection Of New Words In A Large Vocabulary Continuous Speech Recognition System , 1989, HLT.
[22] Eugene Charniak,et al. Statistical Parsing with a Context-Free Grammar and Word Statistics , 1997, AAAI/IAAI.
[23] James R. Glass,et al. Segmentation and modeling in segment-based recognition , 1997, EUROSPEECH.
[24] John J. Godfrey,et al. SWITCHBOARD: telephone speech corpus for research and development , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[25] P. J. Price,et al. Evaluation of Spoken Language Systems: the ATIS Domain , 1990, HLT.
[26] P. Ladefoged. A course in phonetics , 1975 .
[27] Joseph Polifroni,et al. Integrating recognition confidence scoring with language understanding and dialogue modeling , 2000, INTERSPEECH.
[28] Samuel Jay Keyser,et al. CV Phonology: A Generative Theory of the Syllable , 1988 .
[29] Jan Svartvik,et al. The London-Lund corpus of spoken english , 1990 .
[30] Michael Picheny,et al. Automatic phonetic baseform determination , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.
[31] J. Makhoul,et al. Automatic modeling for adding new words to a large-vocabulary continuous speech recognition system , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.
[32] Sheryl R. Young,et al. Recognition Confidence Measures: Detection of Misrecognitions and Out-Of-Vocabulary Words , 1994 .
[33] K. Maekawa. CORPUS OF SPONTANEOUS JAPANESE : ITS DESIGN AND EVALUATION , 2003 .
[34] James R. Glass,et al. Unsupervised Word Acquisition from Speech using Pattern Discovery , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[35] Steven Greenberg,et al. Incorporating information from syllable-length time scales into automatic speech recognition , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[36] O. Fujimura,et al. Syllable as a unit of speech recognition , 1975 .
[37] Grace Yuet-Chee Chung. Towards multi-domain speech understanding with flexible and dynamic vocabulary , 2001 .
[38] Ken-ichi Iso,et al. Speech-activated text retrieval system for multimodal cellular phones , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[39] Jonathan G. Fiscus,et al. NIST Rich Transcription 2002 Evaluation: A Preview , 2002, LREC.
[40] Edward Filisko,et al. Developing attribute acquisition strategies in spoken dialogue systems via user simulation , 2006 .
[41] Victor Zue,et al. Phonological parsing for reversible letter-to-sound/sound-to-letter generation , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.
[42] Joseph Picone,et al. Syllable-based large vocabulary continuous speech recognition , 2001, IEEE Trans. Speech Audio Process..
[43] Timothy J. Hazen,et al. Recognition Confidence Scoring for Use in Speech Understanding Systems , 2000 .
[44] Benoît Maison,et al. Automatic baseform generation from acoustic data , 2003, INTERSPEECH.
[45] P. Kiparsky. From cyclic phonology to lexical phonology , 1982 .
[46] Frédéric Bimbot,et al. Inference of variable-length acoustic units for continuous speech recognition , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[47] James R. Glass,et al. A probabilistic framework for feature-based speech recognition , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[48] James Glass,et al. The SUMMIT speech recognition system: phonological modelling and lexical access , 1990, International Conference on Acoustics, Speech, and Signal Processing.
[49] Steven Greenberg,et al. The modulation spectrogram: in pursuit of an invariant representation of speech , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[50] Otis Gospodnetic,et al. Lucene in Action , 2004 .
[51] Richard M. Schwartz,et al. Analysis of the errors produced by the 2004 BBN speech recognition system in the DARPA EARS evaluations , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[52] Fabio Crestani,et al. Effects of word recognition errors in spoken query processing , 2000, Proceedings IEEE Advances in Digital Libraries 2000.
[53] James R. Glass,et al. New word acquisition using subword modeling , 2007, INTERSPEECH.
[54] Grace Chung. A three-stage solution for flexible vocabulary speech understanding , 2000, INTERSPEECH.
[55] Valentín Cardeñoso-Payo,et al. A system for speech driven information retrieval , 2007, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU).
[56] Jane W. Chang,et al. Near-miss modeling: a segment-based approach to speech recognition , 1998 .
[57] Frederick Jelinek,et al. Classifying words for improved statistical language models , 1990, International Conference on Acoustics, Speech, and Signal Processing.
[58] Alfred Hauenstein. Using syllables in a hybrid HMM-ANN recognition system , 1997, EUROSPEECH.
[59] Thomas Schaaf. Detection of OOV words using generalized word models and a semantic class language model , 2001, INTERSPEECH.
[60] Hideaki Kikuchi,et al. Corpus of Spontaneous Japanese: Design, Annotation and XML Representation , 2004 .
[61] Philip C. Woodland,et al. Particle-based language modelling , 2000, INTERSPEECH.
[62] Yifan Gong,et al. Speech-enabled information retrieval in the automobile environment , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).
[63] Biing-Hwang Juang,et al. Spoken Query Processing for Information Retrieval , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[64] Bhiksha Raj,et al. Spokenquery: an alternate approach to chosing items with speech , 2004, INTERSPEECH.
[65] Lucian Galescu. Recognition of out-of-vocabulary words with sub-lexical language models , 2003, INTERSPEECH.
[66] Helen Meng,et al. The Use of Distinctive Features for Automatic Speech Recognition , 1991 .
[67] Sidney Greenbaum,et al. Comparing English worldwide: the International Corpus of English , 1996 .
[68] Hong C. Leung,et al. New-word addition and adaptation in a stochastic explicit-segment speech recognition system , 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[69] Richard M. Schwartz,et al. The 2004 BBN/LIMSI 20xRT English conversational telephone speech recognition system , 2005, INTERSPEECH.
[70] Ronald Rosenfeld,et al. Optimizing lexical and N-gram coverage via judicious use of linguistic data , 1995, EUROSPEECH.
[71] Alan W. Black,et al. Issues in building general letter to sound rules , 1998, SSW.
[72] Stephanie Seneff,et al. Response planning and generation in the MERCURY flight reservation system , 2002, Comput. Speech Lang..
[73] Giuseppe Riccardi,et al. How may I help you? , 1997, Speech Commun..
[74] Monika Woszczyna,et al. Detection and transcription of new words , 1993, EUROSPEECH.
[75] Fil Alleva,et al. Automatic New Word Acquisition: Spelling from Acoustics , 1989, HLT.
[76] Victor Zue,et al. The MIT SUMMIT Speech Recognition System: A Progress Report , 1989, HLT.
[77] Geoffrey Zweig,et al. Advances in speech transcription at IBM under the DARPA EARS program , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[78] Bhiksha Raj,et al. The MERL SpokenQuery information retrieval system a system for retrieving pertinent documents from a spoken query , 2002, Proceedings. IEEE International Conference on Multimedia and Expo.
[79] Mitch Weintraub,et al. Automatic Learning of Word Pronunciation from Data , 1996 .
[80] Stephanie Strassel. Linguistic Resources for Effective, Affordable, Reusable Speech-to-Text , 2004, LREC.
[81] Ute Ehrlich,et al. How to access audio files of large data bases using in-car speech dialogue systems , 2007, INTERSPEECH.
[82] Stephanie Seneff,et al. Phonological Parsing for Bi-directional Letter-to-Sound/Sound-to-Letter Generation , 1994, HLT.
[83] Hermann Ney,et al. Investigations on joint-multigram models for grapheme-to-phoneme conversion , 2002, INTERSPEECH.
[84] Frank K. Soong,et al. A Tree-Trellis Based Fast Search for Finding the N Best Sentence Hypotheses in Continuous Speech Recognition , 1990, HLT.
[85] Stanley F. Chen,et al. An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.
[86] James R. Glass,et al. Heterogeneous lexical units for automatic speech recognition: preliminary investigations , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[87] James R. Glass. A probabilistic framework for segment-based speech recognition , 2003, Comput. Speech Lang..
[88] Hui Lin,et al. OOV detection by joint word/phone lattice alignment , 2007, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU).
[89] Jeff A. Bilmes,et al. Use of syllable nuclei locations to improve ASR , 2007, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU).
[90] Patti Price,et al. The DARPA 1000-word resource management database for continuous speech recognition , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.
[91] W. Francis,et al. The London-Lund Corpus of Spoken English: Description and Research , 1992 .
[92] Sheryl R. Young,et al. Detecting misrecognitions and out-of-vocabulary words , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.
[93] Alexander H. Waibel,et al. Dictionary learning for spontaneous speech recognition , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[94] Robert L. Mercer,et al. An information theoretic approach to the automatic determination of phonemic baseforms , 1984, ICASSP.
[95] Lalit R. Bahl,et al. A Maximum Likelihood Approach to Continuous Speech Recognition , 1983, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[96] Stephanie Seneff. Reversible Sound-to-Letter/Letter-to-Sound Modeling Based on Syllable Structure , 2007, HLT-NAACL.
[97] Paul Lamere,et al. Design of the CMU Sphinx-4 Decoder , 2003 .
[98] Janet M. Baker,et al. The Design for the Wall Street Journal-based CSR Corpus , 1992, HLT.
[99] Stephanie Seneff,et al. CONTEXT-SENSITIVE LANGUAGE MODELING FOR LARGE SETS OF PROPER NOUNS IN MULTIMODAL DIALOGUE SYSTEMS , 2006, 2006 IEEE Spoken Language Technology Workshop.
[100] Kazuyo Tanaka,et al. Detection of unknown words in large vocabulary speech recognition , 1993, EUROSPEECH.
[101] Frédéric Bimbot,et al. Variable-length sequence matching for phonetic transcription using joint multigrams , 1995, EUROSPEECH.
[102] Yannick Marchand,et al. A multistrategy approach to improving pronunciation by analogy , 2000 .
[103] Hermann Ney,et al. Open vocabulary speech recognition with flat hybrid models , 2005, INTERSPEECH.
[104] James R. Glass,et al. Unsupervised Pattern Discovery in Speech , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[105] James R. Glass. Finding acoustic regularities in speech: applications to phonetic recognition , 1988 .
[106] Stanley F. Chen,et al. Conditional and joint models for grapheme-to-phoneme conversion , 2003, INTERSPEECH.
[107] Richard Lippmann,et al. Speech recognition by machines and humans , 1997, Speech Commun..
[108] Gerard Salton,et al. A vector space model for automatic indexing , 1975, CACM.
[109] Georges Linarès,et al. On-demand new word learning using world wide web , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[110] Nils J. Nilsson,et al. Principles of Artificial Intelligence , 1980, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[111] Richard M. Schwartz,et al. A scalable architecture for Directory Assistance automation , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[112] Min Tang,et al. Combining linguistic knowledge and acoustic information in automatic pronunciation lexicon generation , 2004, INTERSPEECH.
[113] Stephanie Seneff,et al. TINA: A Natural Language System for Spoken Language Applications , 1992, Comput. Linguistics.
[114] S. Rieck,et al. Acoustic modelling of subword units in the Isadora speech recognizer , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[115] James F. Allen,et al. Pronunciation of proper names with a joint n-gram model for bi-directional grapheme-to-phoneme conversion , 2002, INTERSPEECH.
[116] Xuedong Huang,et al. Improvements on a trainable letter-to-sound converter , 1997, EUROSPEECH.
[117] Joseph Polifroni,et al. Recognition confidence scoring and its use in speech understanding systems , 2002, Comput. Speech Lang..
[118] Hong C. Leung,et al. PhoneBook: a phonetically-rich isolated-word telephone-speech database , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[119] Andreas Stolcke,et al. Finding consensus in speech recognition: word error minimization and other applications of confusion networks , 2000, Comput. Speech Lang..
[120] James R. Glass,et al. Real-time probabilistic segmentation for segment-based speech recognition , 1998, ICSLP.
[121] I. Lee Hetherington. The MIT finite-state transducer toolkit for speech and language processing , 2004, INTERSPEECH.
[122] L. Baum,et al. An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process , 1972 .
[123] David G. Stork,et al. Pattern Classification , 1973 .
[124] Victor Zue,et al. The VOYAGER Speech Understanding System: A Progress Report , 1989, HLT.
[125] Stephanie Seneff,et al. Two-pass strategy for handling OOVs in a large vocabulary recognition task , 2005, INTERSPEECH.
[126] Victor Zue,et al. Language modelling for recognition and understanding using layered bigrams , 1992, ICSLP.
[127] Dietrich Klakow,et al. Speech recognition for huge vocabularies by using optimized sub-word units , 2001, INTERSPEECH.
[128] Ronald A. Cole,et al. Speech recognition using syllable-like units , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[129] Mehryar Mohri,et al. Finite-State Transducers in Language and Speech Processing , 1997, CL.
[130] James F. Allen,et al. Bi-directional conversion between graphemes and phonemes using a joint N-gram model , 2001, SSW.
[131] Noam Chomsky,et al. The Sound Pattern of English , 1968 .
[132] Thilo Pfau,et al. Creating large subword units for speech recognition , 1997, EUROSPEECH.
[133] James R. Glass,et al. Segment-based recognition on the phonebook task: initial results and observations on duration modeling , 2001, INTERSPEECH.
[134] Sadaoki Furui,et al. Why Is the Recognition of Spontaneous Speech so Hard? , 2005, TSD.
[135] Timothy J. Hazen,et al. A comparison and combination of methods for OOV word detection and word confidence scoring , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[136] C. D. Gelatt,et al. Optimization by Simulated Annealing , 1983, Science.
[137] Kenneth Ward Church. Phrase-structure parsing: a method for taking advantage of allophonic constraints , 1983 .
[138] Rhys James Jones,et al. Continuous speech recognition using syllables , 1997, EUROSPEECH.
[139] Mark Huckvale,et al. Out-of-vocabulary rate reduction through dispersion-based lexicon acquisition , 2000 .
[140] Hinrich Schütze,et al. Book Reviews: Foundations of Statistical Natural Language Processing , 1999, CL.
[141] L. Zhang,et al. Speech recognition using syllable and pseudo articulatory features modeling , 2005, 2005 International Conference on Natural Language Processing and Knowledge Engineering.
[142] Christian-Michael Westendorf,et al. Learning pronunciation dictionary from speech data , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[143] H. Kucera,et al. Computational analysis of present-day American English , 1967 .
[144] Walter Daelemans,et al. Transcription of out-of-vocabulary words in large vocabulary speech recognition based on phoneme-to-grapheme conversion , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[145] Hermann Ney,et al. Joint-sequence models for grapheme-to-phoneme conversion , 2008, Speech Commun..
[146] Dietrich Klakow,et al. OOV-detection in large vocabulary system using automatically defined word-fragments as fillers , 1999, EUROSPEECH.
[147] Mark A. Randolph,et al. Syllable-based constraints on properties of English sounds , 1989 .
[148] Steven Greenberg,et al. Performance improvements through combining phone- and syllable-scale information in automatic speech recognition , 1998, ICSLP.
[149] James R. Glass,et al. A multi-class approach for modelling out-of-vocabulary words , 2002, INTERSPEECH.
[150] James Glass,et al. Modelling out-of-vocabulary words for robust speech recognition , 2002 .
[151] William I. Hallahan. DECtalk Software: Text-to-Speech Technology and Implementation , 1995, Digit. Tech. J..
[152] Sherif Abdou,et al. The BBN RT04 English broadcast news transcription system , 2005, INTERSPEECH.
[153] Biing-Hwang Juang,et al. Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.
[154] James Glass,et al. Multi-level acoustic segmentation of continuous speech , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.
[155] A. Glavieux,et al. Near Shannon limit error-correcting coding and decoding: Turbo-codes. 1 , 1993, Proceedings of ICC '93 - IEEE International Conference on Communications.
[156] Hsiao-Wuen Hon,et al. An overview of the SPHINX speech recognition system , 1990, IEEE Trans. Acoust. Speech Signal Process..
[157] Hauke Schramm,et al. Strategies for name recognition in automatic directory assistance systems , 2000, Speech Commun..
[158] Stephanie Seneff,et al. ANGIE: a new framework for speech analysis based on morpho-phonological modelling , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.