[1] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[2] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[3] Yoshua Bengio,et al. Deep Sparse Rectifier Neural Networks , 2011, AISTATS.
[4] Joan Bruna,et al. Intriguing properties of neural networks , 2013, ICLR.
[5] Yann LeCun,et al. What is the best multi-stage architecture for object recognition? , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[6] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[7] Luca Bertinetto,et al. Learning feed-forward one-shot learners , 2016, NIPS.
[8] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[9] Alexandre Lacoste,et al. Bayesian Hypernetworks , 2017, arXiv.
[10] Jürgen Schmidhuber. Learning to Control Fast-weight Memories: An Alternative to Dynamic Recurrent Networks , 1991.
[11] Jitendra Malik,et al. Learning to Optimize , 2016, ICLR.
[12] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[13] Julien Cornebise,et al. Weight Uncertainty in Neural Networks , 2015, ICML.
[14] Diederik P. Kingma. Variational inference & deep learning: A new synthesis , 2017 .
[15] Anna Gambin,et al. Improvement of the k-nn Entropy Estimator with Applications in Systems Biology , 2016, Entropy.
[16] Julien Cornebise,et al. Weight Uncertainty in Neural Networks , 2015, arXiv.
[17] Yann LeCun,et al. The Loss Surfaces of Multilayer Networks , 2014, AISTATS.
[18] Deborah Silver,et al. Feature Visualization , 1994, Scientific Visualization.
[19] Jonathon Shlens,et al. Explaining and Harnessing Adversarial Examples , 2014, ICLR.
[20] Luc Van Gool,et al. Dynamic Filter Networks , 2016, NIPS.
[21] Quoc V. Le,et al. Neural Architecture Search with Reinforcement Learning , 2016, ICLR.
[22] Shakir Mohamed,et al. Variational Inference with Normalizing Flows , 2015, ICML.
[23] Andrew L. Maas. Rectifier Nonlinearities Improve Neural Network Acoustic Models , 2013 .
[24] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[25] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[26] Geoffrey E. Hinton,et al. Bayesian Learning for Neural Networks , 1995 .
[27] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[28] Surya Ganguli,et al. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization , 2014, NIPS.
[29] Joan Bruna,et al. Topology and Geometry of Half-Rectified Network Optimization , 2016, ICLR.
[30] Geoffrey E. Hinton,et al. Keeping the neural networks simple by minimizing the description length of the weights , 1993, COLT '93.
[31] Fred A. Hamprecht,et al. Essentially No Barriers in Neural Network Energy Landscape , 2018, ICML.
[32] Marcin Andrychowicz,et al. Learning to learn by gradient descent by gradient descent , 2016, NIPS.
[33] Misha Denil,et al. Predicting Parameters in Deep Learning , 2014 .
[34] Alex Graves,et al. Practical Variational Inference for Neural Networks , 2011, NIPS.
[35] Charles O. Marsh. Introduction to Continuous Entropy , 2013 .
[36] D. Ruppert,et al. Efficient Estimations from a Slowly Convergent Robbins-Monro Process , 1988 .
[37] Max Welling,et al. Multiplicative Normalizing Flows for Variational Bayesian Neural Networks , 2017, ICML.
[38] Ming-Hsuan Yang,et al. Diversified Texture Synthesis with Feed-Forward Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[39] Andrew Gordon Wilson,et al. Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs , 2018, NeurIPS.
[40] Kenneth O. Stanley,et al. A Hypercube-Based Encoding for Evolving Large-Scale Neural Networks , 2009, Artificial Life.
[41] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[42] Hod Lipson,et al. Neural Network Quine , 2018, ALIFE.
[43] Zoubin Ghahramani,et al. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning , 2015, ICML.
[44] A. Kraskov,et al. Estimating mutual information , 2003, Physical Review E: Statistical, Nonlinear, and Soft Matter Physics.
[45] David Duvenaud,et al. Stochastic Hyperparameter Optimization through Hypernetworks , 2018, arXiv.
[46] Yee Whye Teh,et al. Bayesian Learning via Stochastic Gradient Langevin Dynamics , 2011, ICML.
[47] Simon Haykin,et al. Gradient-Based Learning Applied to Document Recognition , 2001.
[48] David M. Blei,et al. Stochastic Gradient Descent as Approximate Bayesian Inference , 2017, J. Mach. Learn. Res..
[49] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[50] Jürgen Schmidhuber,et al. A ‘Self-Referential’ Weight Matrix , 1993 .