Recent innovations in speech-to-text transcription at SRI-ICSI-UW
Andreas Stolcke | Wen Wang | Mei-Yuh Hwang | Arindam Mandal | Dimitra Vergyri | Mari Ostendorf | Jing Zheng | Martin Graciarena | Horacio Franco | Nelson Morgan | Tim Ng | Katrin Kirchhoff | Xin Lei | M. Kemal Sönmez | Qifeng Zhu | Venkata Ramana Rao Gadde | Anand Venkataraman | Barry Y. Chen
[1] B. Atal. Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification , 1974, The Journal of the Acoustical Society of America.
[2] S. Furui,et al. Cepstral analysis technique for automatic speaker verification , 1981 .
[3] Lalit R. Bahl,et al. Maximum mutual information estimation of hidden Markov model parameters for speech recognition , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[4] Hiroshi Maruyama,et al. Structural Disambiguation With Constraint Propagation , 1990, ACL.
[5] H. Hermansky,et al. Perceptual linear predictive (PLP) analysis of speech , 1990, The Journal of the Acoustical Society of America.
[6] S. J. Young,et al. Tree-based state tying for high accuracy acoustic modelling , 1994 .
[7] Hynek Hermansky,et al. RASTA processing of speech , 1994, IEEE Trans. Speech Audio Process.
[8] P. Woodland,et al. Flexible speaker adaptation using maximum likelihood linear regression , 1995 .
[9] Steve Young,et al. Large vocabulary speech recognition , 1995 .
[10] P.C. Woodland,et al. The 1994 HTK large vocabulary speech recognition system , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[11] Philip C. Woodland,et al. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models , 1995, Comput. Speech Lang.
[12] Mark J. F. Gales,et al. The generation and use of regression class trees for MLLR adaptation , 1996 .
[13] S. Wegmann,et al. Speaker normalization on conversational telephone speech , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[14] Michael Picheny,et al. New methods in continuous Mandarin speech recognition , 1997, EUROSPEECH.
[15] Larry P. Heck,et al. A lognormal tied mixture model of pitch for prosody based speaker recognition , 1997, EUROSPEECH.
[16] Jean-Luc Gauvain,et al. Transcribing Broadcast News: The LIMSI Nov96 Hub4 System , 1997 .
[17] Andreas G. Andreou,et al. Investigation of silicon auditory models and generalization of linear discriminant analysis for improved speech recognition , 1997 .
[18] Mehryar Mohri,et al. Finite-State Transducers in Language and Speech Processing , 1997, CL.
[19] Francis Kubala,et al. Fast Robust Inverse Transform SAT and Multi-stage Adaptation , 1998 .
[20] Fernando Pereira,et al. Efficient general lattice generation and rescoring , 1999, EUROSPEECH.
[21] Stanley F. Chen,et al. An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.
[22] Hynek Hermansky,et al. Temporal patterns (TRAPs) in ASR of noisy speech , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99.
[23] Gunnar Evermann,et al. Posterior probability decoding, confidence estimation and system combination , 2000 .
[24] Andreas Stolcke,et al. Finding consensus in speech recognition: word error minimization and other applications of confusion networks , 2000, Comput. Speech Lang.
[25] Daniel Povey,et al. Minimum Phone Error and I-smoothing for improved discriminative training , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[26] Daniel Povey,et al. Large scale discriminative training of hidden Markov models for speech recognition , 2002, Comput. Speech Lang.
[27] Andreas Stolcke,et al. Building an ASR system for noisy environments: SRI's 2001 SPINE evaluation system , 2002, INTERSPEECH.
[28] Kareem Darwish,et al. Building a Shallow Arabic Morphological Analyser in One Day , 2002, SEMITIC@ACL.
[29] Mary P. Harper,et al. The SuperARV Language Model: Investigating the Effectiveness of Tightly Integrating Multiple Knowledge Sources , 2002, EMNLP.
[30] Mark J. F. Gales. Maximum likelihood multiple subspace projections for hidden Markov models , 2002, IEEE Trans. Speech Audio Process.
[31] Andreas Stolcke,et al. Getting More Mileage from Web Text Sources for Conversational Speech Language Modeling using Class-Dependent Mixtures , 2003, NAACL.
[32] Andreas Stolcke,et al. Prosodic knowledge sources for automatic speech recognition , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
[33] Hervé Bourlard,et al. New entropy based combination rules in HMM/ANN multi-stream ASR , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
[34] Jeff A. Bilmes,et al. Factored Language Models and Generalized Parallel Backoff , 2003, NAACL.
[35] Wen Wang,et al. Techniques for effective vocabulary selection , 2003, INTERSPEECH.
[36] The robustness of an almost-parsing language model given errorful training data , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
[37] Mary P. Harper,et al. Statistical parsing and language modeling based on constraint dependency grammar , 2003 .
[38] Andreas Stolcke,et al. The use of a linguistically motivated language model in conversational speech recognition , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[39] Kevin Duh,et al. Automatic Learning of Language Model Structure , 2004, COLING.
[40] Andreas Stolcke,et al. An efficient repair procedure for quick transcriptions , 2004, INTERSPEECH.
[41] Andreas Stolcke,et al. Voicing feature integration in SRI's Decipher LVCSR system , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[42] Andreas Stolcke,et al. Morphology-based language modeling for Arabic speech recognition , 2004, INTERSPEECH.
[43] K. Sonmez,et al. Multirate ASR models for phone-class dependent N-best list rescoring , 2005, IEEE Workshop on Automatic Speech Recognition and Understanding, 2005.
[44] Andreas Stolcke,et al. Leveraging speaker-dependent variation of adaptation , 2005, INTERSPEECH.
[45] Mei-Yuh Hwang,et al. Web-data augmented language models for Mandarin conversational speech recognition , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
[46] Andreas Stolcke,et al. Using MLP features in SRI's conversational speech recognition system , 2005, INTERSPEECH.
[47] Mark J. F. Gales,et al. Progress in the CU-HTK broadcast news transcription system , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[48] Andreas Stolcke,et al. Enriching speech recognition with automatic detection of sentence boundaries and disfluencies , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[49] Andreas Stolcke,et al. Porting Decipher from English to Mandarin , 2006 .