Multi-Dialect Speech Recognition with a Single Sequence-to-Sequence Model
Tara N. Sainath | Patrick Nguyen | Zhifeng Chen | Khe Chai Sim | Michiel Bacchiani | Bo Li | Yonghui Wu | Eugene Weinstein | Kanishka Rao
[1] Georg Heigold, et al. Multilingual acoustic models using distributed deep neural networks, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[2] Hasim Sak, et al. Multi-accent speech recognition with hierarchical grapheme based models, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Yifan Gong, et al. Multi-accent deep neural network acoustic model with accent-specific top layer using the KLD-regularized model adaptation, 2014, INTERSPEECH.
[4] Alex Graves, et al. Sequence Transduction with Recurrent Neural Networks, 2012, ArXiv.
[5] Steve Renals, et al. Multilingual training of deep neural networks, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[6] John R. Hershey, et al. Language independent end-to-end architecture for joint language identification and speech recognition, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[7] Hui Lin, et al. A study on multilingual acoustic modeling for large vocabulary ASR, 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[8] Jasha Droppo, et al. Multi-task learning in deep neural networks for improved phoneme recognition, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[9] Navdeep Jaitly, et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks, 2014, ICML.
[10] Vassilios Diakoloukas, et al. Development of dialect-specific speech recognizers using adaptation methods, 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[11] Liang Lu, et al. On training the recurrent neural network encoder-decoder for large vocabulary end-to-end speech recognition, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Kai Yu, et al. Cluster Adaptive Training for Deep Neural Network Based Acoustic Model, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[13] Marc'Aurelio Ranzato, et al. Large Scale Distributed Deep Networks, 2012, NIPS.
[14] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[15] Tara N. Sainath, et al. Generation of Large-Scale Simulated Utterances in Virtual Rooms to Train Deep-Neural Networks for Far-Field Speech Recognition in Google Home, 2017, INTERSPEECH.
[16] Yoshua Bengio, et al. Attention-Based Models for Speech Recognition, 2015, NIPS.
[17] Ngoc Thang Vu, et al. Multilingual deep neural network based acoustic modeling for rapid language adaptation, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Hui Lin, et al. Learning Methods in Multilingual Speech Recognition, 2008, NIPS 2008.
[19] Hermann Ney, et al. Multilingual acoustic modeling using graphemes, 2003, INTERSPEECH.
[20] Lei Xie, et al. Attention-Based End-to-End Speech Recognition on Voice Search, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Quoc V. Le, et al. Listen, Attend and Spell, 2015, ArXiv.
[22] Yu Zhang, et al. Very deep convolutional networks for end-to-end speech recognition, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Fadi Biadsy, et al. Google's cross-dialect Arabic voice search, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Andrew W. Senior, et al. Fast and accurate recurrent neural network acoustic models for speech recognition, 2015, INTERSPEECH.
[25] Hui Jiang, et al. Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[26] Tara N. Sainath, et al. Lower Frame Rate Neural Network Acoustic Models, 2016, INTERSPEECH.
[27] William J. Byrne, et al. Towards language independent acoustic modeling, 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[28] Martín Abadi, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, 2016, ArXiv.
[29] Pedro J. Moreno, et al. Towards acoustic model unification across dialects, 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[30] Martin Wattenberg, et al. Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation, 2016, TACL.
[31] Dirk Van Compernolle. Recognizing speech of goats, wolves, sheep and ... non-natives, 2001, Speech Commun.
[32] Hynek Hermansky, et al. Cross-lingual and multi-stream posterior features for low resource LVCSR systems, 2010, INTERSPEECH.
[33] Tanja Schultz, et al. Towards universal speech recognition, 2002, Proceedings. Fourth IEEE International Conference on Multimodal Interfaces.
[34] Pedro J. Moreno, et al. Multi-Dialectical Languages Effect on Speech Recognition: Too Much Choice Can Hurt, 2015, ICNLSP.