The 2015 KIT IWSLT speech-to-text systems for English and German
Alex Waibel | Matthias Sperber | Kevin Kilgour | Thai-Son Nguyen
[1] Yoshua Bengio, et al. Extracting and composing robust features with denoising autoencoders, 2008, ICML '08.
[2] Jonathan G. Fiscus, et al. A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER), 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.
[3] Lalit R. Bahl, et al. Maximum mutual information estimation of hidden Markov model parameters for speech recognition, 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[4] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[5] Steve J. Young, et al. MMIE training of large vocabulary recognition systems, 1997, Speech Communication.
[6] Tony Robinson, et al. Scaling recurrent neural network language models, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Chih-Jen Lin, et al. LIBSVM: A library for support vector machines, 2011, TIST.
[8] Finn Dag Buø, et al. JANUS 93: towards spontaneous speech translation, 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.
[9] Alexander H. Waibel, et al. Warped Minimum Variance Distortionless Response based bottle neck features for LVCSR, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[10] Klaus Ries, et al. The Karlsruhe-Verbmobil speech recognition engine, 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[11] Wen Wang, et al. Techniques for effective vocabulary selection, 2003, INTERSPEECH.
[12] Mattias Heldner, et al. The fundamental frequency variation spectrum, 2008.
[13] Marc Schröder, et al. The German Text-to-Speech Synthesis System MARY: A Tool for Research, Development and Teaching, 2003, Int. J. Speech Technol.
[14] Mark J. F. Gales, et al. Semi-tied covariance matrices for hidden Markov models, 1999, IEEE Trans. Speech Audio Process.
[15] Alex Graves, et al. Generating Sequences With Recurrent Neural Networks, 2013, ArXiv.
[16] A. Waibel, et al. The 2014 KIT IWSLT speech-to-text systems for English, German and Italian, 2014, IWSLT.
[17] Florian Metze, et al. Extracting deep bottleneck features using stacked auto-encoders, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[18] Paul Deléglise, et al. Enhancing the TED-LIUM Corpus with Selected Data for Language Modeling and More TED Talks, 2014, LREC.
[19] A. Waibel, et al. A one-pass decoder based on polymorphic linguistic context assignment, 2001, IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.
[20] Hermann Ney, et al. LSTM Neural Networks for Language Modeling, 2012, INTERSPEECH.
[21] Paul Taylor, et al. Festival Speech Synthesis System, 1998.
[22] Matthias Sperber, et al. The 2013 KIT IWSLT speech-to-text systems for German and English, 2013, IWSLT.
[23] Matthias Sperber, et al. Improved Speaker Adaptation by Combining I-vector and fMLLR with Deep Bottleneck Networks, 2017, SPECOM.
[24] Razvan Pascanu, et al. Theano: new features and speed improvements, 2012, ArXiv.
[25] Khe Chai Sim, et al. An investigation of augmenting speaker representations to improve speaker normalisation for DNN-based speech recognition, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Andreas Stolcke, et al. SRILM - an extensible language modeling toolkit, 2002, INTERSPEECH.
[27] Andreas Stolcke, et al. Finding consensus in speech recognition: word error minimization and other applications of confusion networks, 2000, Comput. Speech Lang.
[28] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[29] Sebastian Stüker, et al. Segmentation of Telephone Speech Based on Speech and Non-speech Models, 2013, SPECOM.
[30] Tomoki Toda, et al. The KIT-NAIST (contrastive) English ASR system for IWSLT 2012, 2012, IWSLT.
[31] Marcello Federico, et al. Report on the 10th IWSLT evaluation campaign, 2013, IWSLT.
[32] P. Fränti, et al. Iterative split-and-merge algorithm for VQ codebook generation, 1998.
[33] Brian Kingsbury, et al. Boosted MMI for model and feature-space discriminative training, 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[34] Jeffrey Dean, et al. Efficient Estimation of Word Representations in Vector Space, 2013, ICLR.
[35] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[36] Franz Josef Och, et al. Minimum Error Rate Training in Statistical Machine Translation, 2003, ACL.
[37] Florian Metze, et al. Models of tone for tonal and non-tonal languages, 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.