Limited-Memory BFGS Optimization of Recurrent Neural Network Language Models for Speech Recognition
Zhiyuan Xu | Jianwei Yu | Xunying Liu | Helen M. Meng | Xie Chen | Shansong Liu | Jinze Sha