Learning Simpler Language Models with the Differential State Framework
[1] Jeffrey L. Elman, et al. Finding Structure in Time, 1990, Cogn. Sci.
[2] Michael I. Jordan. Attractor dynamics and parallelism in a connectionist sequential machine, 1990.
[3] C. L. Giles, et al. Second-order recurrent neural networks for grammatical inference, 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[4] Colin Giles, et al. Learning Context-free Grammars: Capabilities and Limitations of a Recurrent Neural Network with an External Stack Memory, 1992.
[5] Boris Polyak, et al. Acceleration of stochastic approximation by averaging, 1992.
[6] C. Lee Giles, et al. Learning and Extracting Finite State Automata with Second-Order Recurrent Neural Networks, 1992, Neural Computation.
[7] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.
[8] Srimat T. Chakradhar, et al. First-order versus second-order single-layer recurrent neural networks, 1994, IEEE Trans. Neural Networks.
[9] A. Roli. Artificial Neural Networks, 2012, Lecture Notes in Computer Science.
[10] Geoffrey E. Hinton, et al. Bayesian Learning for Neural Networks, 1995.
[11] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[12] C. Lee Giles, et al. The Neural Network Pushdown Automaton: Architecture, Dynamics and Training, 1997, Summer School on Neural Networks.
[13] E. Newport, et al. Computation of Conditional Probability Statistics by 8-Month-Old Infants, 1998.
[14] Jürgen Schmidhuber, et al. Recurrent nets that time and count, 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN 2000).
[15] John Hale, et al. A Probabilistic Earley Parser as a Psycholinguistic Model, 2001, NAACL.
[16] Ah Chung Tsoi, et al. Noisy Time Series Prediction using Recurrent Neural Networks and Grammatical Inference, 2001, Machine Learning.
[17] Michael C. Mozer, et al. Neural net architectures for temporal sequence processing, 2007.
[18] R. Levy. Expectation-based syntactic comprehension, 2008, Cognition.
[19] Reinhold Kliegl, et al. Parsing costs as predictors of reading difficulty: An evaluation using the Potsdam Sentence Corpus, 2008, Journal of Eye Movement Research.
[20] Yoshua Bengio, et al. Quadratic Features and Deep Architectures for Chunking, 2009, NAACL.
[21] Lukás Burget, et al. Recurrent neural network based language model, 2010, INTERSPEECH.
[22] Lukás Burget, et al. Extensions of recurrent neural network language model, 2011, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Ilya Sutskever, et al. Subword Language Modeling with Neural Networks, 2011.
[24] Christopher Potts, et al. Learning Word Vectors for Sentiment Analysis, 2011, ACL.
[25] Tomas Mikolov. Statistical Language Models Based on Neural Networks, 2012, Ph.D. thesis, Brno University of Technology.
[26] Razvan Pascanu, et al. On the difficulty of training recurrent neural networks, 2012, ICML.
[27] Alex Graves, et al. Generating Sequences With Recurrent Neural Networks, 2013, ArXiv.
[28] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[29] Jürgen Schmidhuber, et al. A Clockwork RNN, 2014, ICML.
[30] Yoshua Bengio, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, 2014, ArXiv.
[31] Wojciech Zaremba, et al. Recurrent Neural Network Regularization, 2014, ArXiv.
[32] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[33] Jason Weston, et al. End-To-End Memory Networks, 2015, NIPS.
[34] Wojciech Zaremba, et al. An Empirical Exploration of Recurrent Network Architectures, 2015, ICML.
[35] Kyunghyun Cho, et al. Larger-Context Language Modelling, 2015, ArXiv.
[36] Jason Weston, et al. Memory Networks, 2014, ICLR.
[37] Marc'Aurelio Ranzato, et al. Learning Longer Memory in Recurrent Neural Networks, 2014, ICLR.
[38] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[39] Yoshua Bengio, et al. Gated Feedback Recurrent Neural Networks, 2015, ICML.
[40] Tomas Mikolov, et al. Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets, 2015, NIPS.
[41] Geoffrey E. Hinton, et al. A Simple Way to Initialize Recurrent Networks of Rectified Linear Units, 2015, ArXiv.
[42] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[43] Zoubin Ghahramani, et al. A Theoretically Grounded Application of Dropout in Recurrent Neural Networks, 2015, NIPS.
[44] Ying Zhang, et al. On Multiplicative Integration with Recurrent Neural Networks, 2016, NIPS.
[45] Martin Sundermeyer, et al. Improvements in language and translation modeling, 2016.
[46] Misha Denil, et al. Noisy Activation Functions, 2016, ICML.
[47] Jian Sun, et al. Identity Mappings in Deep Residual Networks, 2016, ECCV.
[48] Sergio Gomez Colmenarejo, et al. Hybrid computing using a neural network with dynamic external memory, 2016, Nature.
[49] Joelle Pineau, et al. Multi-modal Variational Encoder-Decoders, 2016, ArXiv.
[50] Jianxin Wu, et al. Minimal gated unit for recurrent neural networks, 2016, International Journal of Automation and Computing.
[51] Geoffrey E. Hinton, et al. Layer Normalization, 2016, ArXiv.
[52] Yoshua Bengio, et al. Memory Augmented Neural Networks with Wormhole Connections, 2017, ArXiv.
[53] Quoc V. Le, et al. HyperNetworks, 2016, ICLR.
[54] Yoshua Bengio, et al. Hierarchical Multiscale Recurrent Neural Networks, 2016, ICLR.
[55] Tomas Mikolov, et al. Variable Computation in Recurrent Neural Networks, 2016, ICLR.
[56] Aaron C. Courville, et al. Recurrent Batch Normalization, 2016, ICLR.
[57] Yoshua Bengio, et al. Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations, 2016, ICLR.
[58] Joelle Pineau, et al. Piecewise Latent Variables for Neural Variational Text Processing, 2016, EMNLP.