Lessons from Building Acoustic Models with a Million Hours of Speech
[1] Hank Liao,et al. Large scale deep neural network acoustic modeling with semi-supervised training data for YouTube video transcription , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[2] Sree Hari Krishnan Parthasarathi,et al. fMLLR based feature-space speaker adaptation of DNN acoustic models , 2015, INTERSPEECH.
[3] Kilian Q. Weinberger,et al. On Calibration of Modern Neural Networks , 2017, ICML.
[4] Mark Hasegawa-Johnson,et al. Semi-supervised training of Gaussian mixture models by conditional entropy minimization , 2010, INTERSPEECH.
[5] S. Chiba,et al. Dynamic programming algorithm optimization for spoken word recognition , 1978 .
[6] Yifan Gong,et al. Large-Scale Domain Adaptation via Teacher-Student Learning , 2017, INTERSPEECH.
[7] Janet M. Baker,et al. The Design for the Wall Street Journal-based CSR Corpus , 1992, HLT.
[8] Kenneth Ward Church,et al. Deep neural network features and semi-supervised training for low resource speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[9] Chong Wang,et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin , 2015, ICML.
[10] Lei Xie,et al. Empirical Evaluation of Parallel Training Algorithms on Acoustic Modeling , 2017, INTERSPEECH.
[11] Qiang Huo,et al. Scalable training of deep learning machines by incremental block training with intra-block parallel optimization and blockwise model-update filtering , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[13] James H. Martin,et al. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition , 2000 .
[14] Naoyuki Kanda,et al. Investigation of Semi-Supervised Acoustic Model Training Based on the Committee of Heterogeneous Neural Networks , 2016, INTERSPEECH.
[15] Yifan Gong,et al. Semi-supervised GMM and DNN acoustic model training with multi-system combination and confidence re-calibration , 2013, INTERSPEECH.
[16] David Miller,et al. The Fisher Corpus: a Resource for the Next Generations of Speech-to-Text , 2004, LREC.
[17] Frederick Jelinek,et al. Some of my Best Friends are Linguists , 2005, Lang. Resour. Evaluation.
[18] Rich Caruana,et al. Do Deep Nets Really Need to be Deep? , 2013, NIPS.
[19] Sree Hari Krishnan Parthasarathi,et al. Robust Speech Recognition via Anchor Word Representations , 2017, INTERSPEECH.
[20] Richard M. Schwartz,et al. Unsupervised Training on Large Amounts of Broadcast News Data , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[21] Nikko Strom,et al. Scalable distributed DNN training using commodity GPU cloud computing , 2015, INTERSPEECH.
[22] Jonathan G. Fiscus,et al. DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM , 1993, NIST.
[23] Alexander H. Waibel,et al. Unsupervised training of a speech recognizer: recent experiments , 1999, EUROSPEECH.
[24] Michelle Guo,et al. Knowledge distillation for small-footprint highway networks , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Herbert Gish,et al. Improved estimation, evaluation and applications of confidence measures for speech recognition , 1997, EUROSPEECH.
[26] Jean-Luc Gauvain,et al. Unsupervised acoustic model training , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[27] Hagen Soltau,et al. Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition , 2016, INTERSPEECH.
[28] Sanjeev Khudanpur,et al. Semi-supervised maximum mutual information training of deep neural network acoustic models , 2015, INTERSPEECH.
[29] Sree Hari Krishnan Parthasarathi,et al. Robust i-vector based adaptation of DNN acoustic model for speech recognition , 2015, INTERSPEECH.
[30] Tara N. Sainath,et al. Lower Frame Rate Neural Network Acoustic Models , 2016, INTERSPEECH.
[31] Sankaran Panchapagesan,et al. Model Compression Applied to Small-Footprint Keyword Spotting , 2016, INTERSPEECH.
[32] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[33] Yongqiang Wang,et al. Semi-Supervised Training in Deep Learning Acoustic Model , 2016, INTERSPEECH.
[34] Fernando Pereira,et al. Distributed acoustic modeling with back-off n-grams , 2012, ICASSP.
[35] Hermann Ney,et al. Fast and Robust Training of Recurrent Neural Networks for Offline Handwriting Recognition , 2014, 2014 14th International Conference on Frontiers in Handwriting Recognition.
[36] Roger K. Moore. A comparison of the data requirements of automatic speech recognition systems and human listeners , 2003, INTERSPEECH.
[37] John J. Godfrey,et al. SWITCHBOARD: telephone speech corpus for research and development , 1992, ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[38] Jean-Luc Gauvain,et al. Lightly supervised and unsupervised acoustic model training , 2002, Comput. Speech Lang..
[39] Mark Hasegawa-Johnson,et al. Maximum mutual information estimation with unlabeled data for phonetic classification , 2008, INTERSPEECH.
[40] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.