Investigation of Modeling Units for Mandarin Speech Recognition Using DFSMN-CTC-sMBR
Shiliang Zhang | Ming Lei | Wei Li | Yuan Liu
[1] Andrew W. Senior, et al. Fast and accurate recurrent neural network acoustic models for speech recognition, 2015, INTERSPEECH.
[2] Dong Yu, et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[3] Shiliang Zhang, et al. Deep-FSMN for Large Vocabulary Continuous Speech Recognition, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Tara N. Sainath, et al. Lower Frame Rate Neural Network Acoustic Models, 2016, INTERSPEECH.
[5] Qiang Huo, et al. Scalable training of deep learning machines by incremental block training with intra-block parallel optimization and blockwise model-update filtering, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Brian Kingsbury, et al. Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling, 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[7] Jie Li, et al. Towards end-to-end speech recognition for Chinese Mandarin using long short-term memory recurrent neural networks, 2015, INTERSPEECH.
[8] Yu Hu, et al. Nonrecurrent Neural Structure for Long-Term Dependence, 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[9] Yifan Gong, et al. Advancing Acoustic-to-Word CTC Model, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[11] Xiangang Li, et al. A comparative study on selecting acoustic modeling units in deep neural networks based large vocabulary Chinese speech recognition, 2013, Neurocomputing.
[12] Johan Schalkwyk, et al. Learning acoustic frame labeling for speech recognition with recurrent neural networks, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Andrew W. Senior, et al. Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition, 2014, ArXiv.
[14] Yajie Miao, et al. EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding, 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[15] Lukás Burget, et al. Sequence-discriminative training of deep neural networks, 2013, INTERSPEECH.
[16] Navdeep Jaitly, et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks, 2014, ICML.
[17] Jürgen Schmidhuber, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, 2006, ICML.
[18] Bo Xu, et al. Acoustic modeling for Chinese speech recognition: a comparative study of Mandarin and Cantonese, 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[19] Hao Wu, et al. Context dependent syllable acoustic model for continuous Chinese speech recognition, 2007, INTERSPEECH.
[20] Erich Elsen, et al. Deep Speech: Scaling up end-to-end speech recognition, 2014, ArXiv.
[21] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[22] Yu Hu, et al. Feedforward Sequential Memory Networks: A New Structure to Learn Long-term Dependency, 2015, ArXiv.
[23] Shiliang Zhang, et al. Acoustic Modeling with DFSMN-CTC and Joint CTC-CE Learning, 2018, INTERSPEECH.
[24] Yu Zhang, et al. Highway long short-term memory RNNs for distant speech recognition, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Chong Wang, et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin, 2015, ICML.