Partitioned integrators for thermodynamic parameterization of neural networks
[1] Benedict Leimkuhler,et al. Hypocoercivity properties of adaptive Langevin dynamics , 2020, SIAM J. Appl. Math.
[2] Micah Goldblum,et al. Understanding Generalization through Visualizations , 2019, ICBINB@NeurIPS.
[3] J. Yosinski,et al. LCA: Loss Change Allocation for Neural Network Training , 2019, NeurIPS.
[4] Demis Hassabis,et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play , 2018, Science.
[5] Eric Moulines,et al. The promises and pitfalls of Stochastic Gradient Langevin Dynamics , 2018, NeurIPS.
[6] David P. Herzog. Exponential relaxation of the Nosé-Hoover equation under Brownian heating , 2018, 1804.05153.
[7] Yann Dauphin,et al. Empirical Analysis of the Hessian of Over-Parametrized Neural Networks , 2017, ICLR.
[8] Vincent Danos,et al. Langevin Dynamics with Variable Coefficients and Nonconservative Forces: From Stationary States to Numerical Methods , 2017, Entropy.
[9] Yoshua Bengio,et al. Three Factors Influencing Minima in SGD , 2017, ArXiv.
[10] Andrew Y. Ng,et al. Improving palliative care with deep learning , 2017, 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM).
[11] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017 .
[12] Nathan Srebro,et al. The Marginal Value of Adaptive Gradient Methods in Machine Learning , 2017, NIPS.
[13] Levent Sagun,et al. Energy landscapes for machine learning. , 2017, Physical chemistry chemical physics : PCCP.
[14] Samy Bengio,et al. Understanding deep learning requires rethinking generalization , 2016, ICLR.
[15] Alexandre Tkatchenko,et al. Quantum-chemical insights from deep tensor neural networks , 2016, Nature Communications.
[16] Daniel Jiwoong Im,et al. An empirical analysis of the optimization of deep network loss surfaces , 2016, 1612.04010.
[17] Yann LeCun,et al. Singularity of the Hessian in Deep Learning , 2016, ArXiv.
[18] Daniel Jiwoong Im,et al. An Empirical Analysis of Deep Network Loss Surfaces , 2016, ArXiv.
[19] Benedict Leimkuhler,et al. Adaptive Thermostats for Noisy Gradient Systems , 2015, SIAM J. Sci. Comput.
[20] Bharat Singh,et al. Layer-Specific Adaptive Learning Rates for Deep Networks , 2015, 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA).
[21] É. Moulines,et al. Non-asymptotic convergence analysis for the Unadjusted Langevin Algorithm , 2015, 1507.05021.
[22] B. Leimkuhler,et al. Molecular Dynamics: With Deterministic and Stochastic Numerical Methods , 2015 .
[23] Tianqi Chen,et al. Empirical Evaluation of Rectified Activations in Convolutional Network , 2015, ArXiv.
[24] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[25] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[26] Ryota Tomioka,et al. In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning , 2014, ICLR.
[27] Oriol Vinyals,et al. Qualitatively characterizing neural network optimization problems , 2014, ICLR.
[28] Yann LeCun,et al. The Loss Surfaces of Multilayer Networks , 2014, AISTATS.
[29] B. Leimkuhler,et al. The computation of averages from equilibrium and nonequilibrium Langevin molecular dynamics , 2013, 1308.5814.
[30] Ryan Babbush,et al. Bayesian Sampling Using Stochastic Gradient Thermostats , 2014, NIPS.
[31] Surya Ganguli,et al. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization , 2014, NIPS.
[32] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.
[33] Kevin P. Murphy,et al. Machine learning - a probabilistic perspective , 2012, Adaptive computation and machine learning series.
[34] B. Leimkuhler,et al. Adaptive stochastic methods for sampling driven molecular systems. , 2011, The Journal of chemical physics.
[35] Yee Whye Teh,et al. Bayesian Learning via Stochastic Gradient Langevin Dynamics , 2011, ICML.
[36] Yoram Singer,et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res.
[37] Davis E. King,et al. Dlib-ml: A Machine Learning Toolkit , 2009, J. Mach. Learn. Res.
[38] Yann LeCun,et al. What is the best multi-stage architecture for object recognition? , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[39] Anthony Auerbach,et al. Observations on rate theory for rugged energy landscapes. , 2008, Biophysical journal.
[40] C. Mouhot,et al. Hypocoercivity for kinetic equations with linear relaxation terms , 2008, 0810.3493.
[41] H. Kushner,et al. Stochastic Approximation and Recursive Algorithms and Applications , 2003 .
[42] Jonathan C. Mattingly,et al. Ergodicity for SDEs and approximations: locally Lipschitz vector fields and degenerate noise , 2002 .
[43] Arthur E. Hoerl,et al. Ridge Regression: Biased Estimation for Nonorthogonal Problems , 2000, Technometrics.
[44] R. Tweedie,et al. Exponential convergence of Langevin distributions and their discrete approximations , 1996 .
[45] R. Tibshirani. Regression Shrinkage and Selection via the Lasso , 1996 .
[46] Peter M. Williams,et al. Bayesian Regularization and Pruning Using a Laplace Prior , 1995, Neural Computation.
[47] Geoffrey E. Hinton,et al. Bayesian Learning for Neural Networks , 1995 .
[48] S. Meyn,et al. Stability of Markovian processes II: continuous-time processes and sampled chains , 1993, Advances in Applied Probability.
[49] G. Parisi,et al. Simulated tempering: a new Monte Carlo scheme , 1992, hep-lat/9205018.
[50] C. Geyer. Markov Chain Monte Carlo Maximum Likelihood , 1991 .
[51] R. Zwanzig,et al. Diffusion in a rough potential. , 1988, Proceedings of the National Academy of Sciences of the United States of America.
[52] K. Vahala. Handbook of stochastic methods for physics, chemistry and the natural sciences , 1986, IEEE Journal of Quantum Electronics.
[53] C. W. Gardiner,et al. Handbook of stochastic methods - for physics, chemistry and the natural sciences, Second Edition , 1986, Springer series in synergetics.
[54] Hoover,et al. Canonical dynamics: Equilibrium phase-space distributions. , 1985, Physical review. A, General physics.
[55] S. Nosé. A unified formulation of the constant temperature molecular dynamics methods , 1984 .
[56] C. D. Gelatt,et al. Optimization by Simulated Annealing , 1983, Science.