Character-level language modeling with hierarchical recurrent neural networks
[1] Hermann Ney, et al. Improved backing-off for M-gram language modeling, 1995, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).
[2] Geoffrey E. Hinton, et al. Generating Text with Recurrent Neural Networks, 2011, ICML.
[3] Nitish Srivastava, et al. Improving neural networks by preventing co-adaptation of feature detectors, 2012, arXiv.
[4] Paul J. Werbos. Backpropagation Through Time: What It Does and How to Do It, 1990, Proceedings of the IEEE.
[5] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[6] Jing Peng, et al. An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories, 1990, Neural Computation.
[7] Tie-Yan Liu, et al. LightRNN: Memory and Computation-Efficient Recurrent Neural Networks, 2016, NIPS.
[8] John Cocke, et al. A Statistical Approach to Machine Translation, 1990, Computational Linguistics.
[9] Jürgen Schmidhuber, et al. Learning Precise Timing with LSTM Recurrent Networks, 2003, Journal of Machine Learning Research.
[10] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, Computational Linguistics.
[11] Pradeep Dubey, et al. BlackOut: Speeding up Recurrent Neural Network Language Models With Very Large Vocabularies, 2015, ICLR.
[12] Tomáš Mikolov. Statistical Language Models Based on Neural Networks, 2012, Ph.D. thesis, Brno University of Technology.
[13] Wang Ling, et al. Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation, 2015, EMNLP.
[14] Benjamin Schrauwen, et al. Training and Analysing Deep Recurrent Neural Networks, 2013, NIPS.
[15] Janet M. Baker, et al. The Design for the Wall Street Journal-based CSR Corpus, 1992, HLT.
[16] Lukáš Burget, et al. Recurrent neural network based language model, 2010, INTERSPEECH.
[17] John Scott Bridle, et al. Probabilistic Interpretation of Feedforward Classification Network Outputs, with Relationships to Statistical Pattern Recognition, 1989, NATO Neurocomputing.
[18] Alex Graves, et al. Connectionist Temporal Classification, 2012.
[19] Joris Pelemans, et al. Sparse non-negative matrix language modeling for skip-grams, 2015, INTERSPEECH.
[20] Geoffrey Zweig, et al. Context dependent recurrent neural network language model, 2012, IEEE Spoken Language Technology Workshop (SLT).
[21] Y. Nesterov. A method for solving the convex programming problem with convergence rate O(1/k^2), 1983.
[22] Wonyong Sung, et al. Single stream parallelization of generalized LSTM-like RNNs on a GPU, 2015, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Thorsten Brants, et al. One billion word benchmark for measuring progress in statistical language modeling, 2013, INTERSPEECH.
[24] Yonghui Wu, et al. Exploring the Limits of Language Modeling, 2016, arXiv.
[25] Alexander M. Rush, et al. Character-Aware Neural Language Models, 2015, AAAI.
[26] Jeffrey L. Elman. Finding Structure in Time, 1990, Cognitive Science.
[27] Y. Nesterov. A method for unconstrained convex minimization problem with the rate of convergence o(1/k^2), 1983.
[28] Wang Ling, et al. Character-based Neural Machine Translation, 2015, arXiv.
[29] Kyunghyun Cho, et al. Gated Word-Character Recurrent Language Model, 2016, EMNLP.
[30] Wonyong Sung, et al. Character-level incremental speech recognition with recurrent neural networks, 2016, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Biing-Hwang Juang, et al. Fundamentals of Speech Recognition, 1993, Prentice Hall Signal Processing Series.
[32] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method, 2012, arXiv.
[33] Jürgen Schmidhuber, et al. Learning to Forget: Continual Prediction with LSTM, 2000, Neural Computation.