[1] Richard Socher, et al. Pointer Sentinel Mixture Models, 2016, ICLR.
[2] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[3] Quoc V. Le, et al. Learning Longer-term Dependencies in RNNs with Auxiliary Losses, 2018, ICML.
[4] Frank Hutter, et al. SGDR: Stochastic Gradient Descent with Restarts, 2016, ArXiv.
[5] Kilian Q. Weinberger, et al. Snapshot Ensembles: Train 1, get M for free, 2017, ICLR.
[6] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[7] Lukáš Burget, et al. Recurrent neural network based language model, 2010, INTERSPEECH.
[8] Frank Hutter, et al. SGDR: Stochastic Gradient Descent with Warm Restarts, 2016, ICLR.
[9] Alex Graves, et al. Neural Turing Machines, 2014, ArXiv.
[10] Richard Socher, et al. Quasi-Recurrent Neural Networks, 2016, ICLR.
[11] Marcin Andrychowicz, et al. Learning to learn by gradient descent by gradient descent, 2016, NIPS.
[12] Yoshua Bengio, et al. Unitary Evolution Recurrent Neural Networks, 2015, ICML.
[13] Richard Socher, et al. Regularizing and Optimizing LSTM Language Models, 2017, ICLR.
[14] Max Jaderberg, et al. Understanding Synthetic Gradients and Decoupled Neural Interfaces, 2017, ICML.
[15] Boris Polyak, et al. Acceleration of stochastic approximation by averaging, 1992.
[16] Alex Graves, et al. Decoupled Neural Interfaces using Synthetic Gradients, 2016, ICML.
[17] Bin Gu, et al. Training Neural Networks Using Features Replay, 2018, NeurIPS.