[1] Geoffrey E. Hinton. Connectionist Learning Procedures, 1989, Artif. Intell.
[2] Peter Földiák, et al. Learning Invariance from Transformation Sequences, 1991, Neural Comput.
[3] Boris Polyak, et al. Acceleration of stochastic approximation by averaging, 1992.
[4] Chris Callison-Burch, et al. Open Source Toolkit for Statistical Machine Translation: Factored Translation Models and Lattice Decoding, 2006.
[5] Philipp Koehn, et al. Moses: Open Source Toolkit for Statistical Machine Translation, 2007, ACL.
[6] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[7] Lukás Burget, et al. Recurrent neural network based language model, 2010, INTERSPEECH.
[8] Geoffrey Zweig, et al. Context dependent recurrent neural network language model, 2012, IEEE Spoken Language Technology Workshop (SLT).
[9] Jürgen Schmidhuber, et al. Low Complexity Proto-Value Function Learning from Sensory Observations with Incremental Slow Feature Analysis, 2012, ICANN.
[10] Yann LeCun, et al. Regularization of Neural Networks using DropConnect, 2013, ICML.
[11] Geoffrey E. Hinton, et al. On the importance of initialization and momentum in deep learning, 2013, ICML.
[12] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[13] Wojciech Zaremba, et al. Recurrent Neural Network Regularization, 2014, ArXiv.
[14] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[15] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[16] George Saon, et al. A nonmonotone learning rate strategy for SGD training of deep neural networks, 2015, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Oliver Brock, et al. Learning state representations with robotic priors, 2015, Auton. Robots.
[18] Georgios Piliouras, et al. Gradient Descent Converges to Minimizers: The Case of Non-Isolated Critical Points, 2016, ArXiv.
[19] Zoubin Ghahramani, et al. A Theoretically Grounded Application of Dropout in Recurrent Neural Networks, 2015, NIPS.
[20] Yoram Singer, et al. Train faster, generalize better: Stability of stochastic gradient descent, 2015, ICML.
[21] Les E. Atlas, et al. Full-Capacity Unitary Recurrent Neural Networks, 2016, NIPS.
[22] Muhammad Ghifary, et al. Strongly-Typed Recurrent Neural Networks, 2016, ICML.
[23] Alexander M. Rush, et al. Character-Aware Neural Language Models, 2015, AAAI.
[24] Erhardt Barth, et al. Recurrent Dropout without Memory Loss, 2016, COLING.
[25] Yoshua Bengio, et al. Unitary Evolution Recurrent Neural Networks, 2015, ICML.
[26] Kaiming He, et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour, 2017, ArXiv.
[27] Yann LeCun, et al. Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs, 2016, ICML.
[28] Nathan Srebro, et al. The Marginal Value of Adaptive Gradient Methods in Machine Learning, 2017, NIPS.
[29] Nicolas Usunier, et al. Improving Neural Language Models with a Continuous Cache, 2016, ICLR.
[30] Richard Socher, et al. Quasi-Recurrent Neural Networks, 2016, ICLR.
[31] Hakan Inan, et al. Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling, 2016, ICLR.
[32] Ali Farhadi, et al. Query-Reduction Networks for Question Answering, 2016, ICLR.
[33] Lior Wolf, et al. Using the Output Embedding to Improve Language Models, 2016, EACL.
[34] Quoc V. Le, et al. Neural Architecture Search with Reinforcement Learning, 2016, ICLR.
[35] Jürgen Schmidhuber, et al. Recurrent Highway Networks, 2016, ICML.
[36] Aaron C. Courville, et al. Recurrent Batch Normalization, 2016, ICLR.
[37] Yoshua Bengio, et al. Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations, 2016, ICLR.
[38] Yann Ollivier, et al. Unbiasing Truncated Backpropagation Through Time, 2017, ArXiv.
[39] David M. Blei, et al. Stochastic Gradient Descent as Approximate Bayesian Inference, 2017, J. Mach. Learn. Res.
[40] Richard Socher, et al. Pointer Sentinel Mixture Models, 2016, ICLR.
[41] Richard Socher, et al. Revisiting Activation Regularization for Language RNNs, 2017, ArXiv.
[42] Jorge Nocedal, et al. Optimization Methods for Large-Scale Machine Learning, 2016, SIAM Rev.
[43] Chris Dyer, et al. On the State of the Art of Evaluation in Neural Language Models, 2017, ICLR.