[1] Jimmy Ba, et al. Kronecker-factored Curvature Approximations for Recurrent Neural Networks, 2018, ICLR.
[2] J. Dormand, et al. A family of embedded Runge-Kutta formulae, 1980.
[3] Michael Flynn, et al. The UEA multivariate time series classification archive, 2018, ArXiv.
[4] Bobbi Jo Broxson. The Kronecker Product, 2006.
[5] David Duvenaud, et al. Latent ODEs for Irregularly-Sampled Time Series, 2019, ArXiv.
[6] Matthew J. Johnson, et al. Learning Differential Equations that are Easy to Solve, 2020, NeurIPS.
[7] Adam M. Oberman, et al. How to Train Your Neural ODE: the World of Jacobian and Kinetic Regularization, 2020, ICML.
[8] Pascal Vincent, et al. Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis, 2018, NeurIPS.
[9] Evangelos A. Theodorou, et al. Differential Dynamic Programming Neural Optimizer, 2020, ArXiv.
[10] Rajesh P. N. Rao, et al. Bayesian Brain: Probabilistic Approaches to Neural Coding, 2006.
[11] Christopher De Sa, et al. Neural Manifold Ordinary Differential Equations, 2020, NeurIPS.
[12] Anna Kazeykina, et al. Mean-field Langevin System, Optimal Control and Deep Neural Networks, 2019, ArXiv.
[13] Evangelos A. Theodorou, et al. Dynamic Game Theoretic Neural Optimizer, 2021, ICML.
[14] Matthias Gerdts, et al. Free finite horizon LQR: A bilevel perspective and its application to model predictive control, 2019, Automatica.
[15] Yuval Tassa, et al. Stochastic Differential Dynamic Programming, 2010, Proceedings of the 2010 American Control Conference.
[16] William H. Press, et al. Numerical Recipes 3rd Edition: The Art of Scientific Computing, 2007.
[17] Roger B. Grosse, et al. A Kronecker-factored approximate Fisher matrix for convolution layers, 2016, ICML.
[18] Sekhar Tatikonda, et al. MALI: A memory efficient and reverse accurate integrator for Neural ODEs, 2021, ICLR.
[19] David Barber, et al. Practical Gauss-Newton Optimisation for Deep Learning, 2017, ICML.
[20] E Weinan, et al. A Proposal on Machine Learning via Dynamical Systems, 2017, Communications in Mathematics and Statistics.
[21] Yoram Singer, et al. Shampoo: Preconditioned Stochastic Tensor Optimization, 2018, ICML.
[22] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.
[23] Long Chen, et al. Maximum Principle Based Algorithms for Deep Learning, 2017, J. Mach. Learn. Res.
[24] Wei Sun, et al. Model Based Reinforcement Learning with Final Time Horizon Optimization, 2015, ArXiv.
[25] Thomas Serre, et al. Go with the flow: Adaptive control for Neural ODEs, 2020, ICLR.
[26] Guodong Zhang, et al. Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks, 2019, NeurIPS.
[27] Roger B. Grosse, et al. Optimizing Neural Networks with Kronecker-factored Approximate Curvature, 2015, ICML.
[28] Pascal Vincent, et al. An Evaluation of Fisher Approximations Beyond Kronecker Factorization, 2018, ICLR.
[29] Richard G. Baraniuk, et al. InfoCNF: An Efficient Conditional Continuous Normalizing Flow with Adaptive Solvers, 2019, ArXiv.
[30] Zidong Wang, et al. A Trace-restricted Kronecker-Factored Approximation to Natural Gradient, 2020, ArXiv.
[31] Amit Chakraborty, et al. Symplectic ODE-Net: Learning Hamiltonian Dynamics with Control, 2020, ICLR.
[32] Roger B. Grosse, et al. Distributed Second-Order Optimization using Kronecker-Factored Approximations, 2016, ICLR.
[33] Nader Sadegh, et al. Infinite Horizon Nonlinear Quadratic Cost Regulator, 2019, 2019 American Control Conference (ACC).
[34] L. S. Pontryagin, et al. The Mathematical Theory of Optimal Processes, 1962.
[35] Kurt Keutzer, et al. ANODE: Unconditionally Accurate Memory-Efficient Gradients for Neural ODEs, 2019, IJCAI.
[36] Rong Ge, et al. Dissecting Hessian: Understanding Common Structure of Hessian in Neural Networks, 2020, ArXiv.
[37] Shun-ichi Amari, et al. Methods of Information Geometry, 2000.
[38] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[39] Patrick Kidger, et al. "Hey, that's not an ODE": Faster ODE Adjoints with 12 Lines of Code, 2020, ArXiv.
[40] Hajime Asama, et al. Dissecting Neural ODEs, 2020, NeurIPS.
[41] Evangelos A. Theodorou, et al. Deep Learning Theory Review: An Optimal Control and Dynamical Systems Perspective, 2019, ArXiv.
[42] E Weinan, et al. A mean-field optimal control formulation of deep learning, 2018, Research in the Mathematical Sciences.
[43] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[44] Kurt Keutzer, et al. Inefficiency of K-FAC for Large Batch Size Training, 2019, AAAI.
[45] Yuval Tassa, et al. Control-limited differential dynamic programming, 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[46] Philip H. S. Torr, et al. STEER: Simple Temporal Regularization For Neural ODEs, 2020, NeurIPS.
[47] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[48] Aleksander Madry, et al. How Does Batch Normalization Help Optimization? (No, It Is Not About Internal Covariate Shift), 2018, NeurIPS.
[49] Yann LeCun, et al. A Theoretical Framework for Back-Propagation, 1988.
[50] David Duvenaud, et al. Neural Ordinary Differential Equations, 2018, NeurIPS.
[51] Terry Lyons, et al. Neural Controlled Differential Equations for Irregular Time Series, 2020, NeurIPS.
[52] James Martens, et al. New Insights and Perspectives on the Natural Gradient Method, 2014, J. Mach. Learn. Res.
[53] Maximilian Nickel, et al. Riemannian Continuous Normalizing Flows, 2020, NeurIPS.
[54] David Duvenaud, et al. FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models, 2018, ICLR.
[55] Xingjian Li, et al. OT-Flow: Fast and Accurate Continuous Normalizing Flows via Optimal Transport, 2020, ArXiv.
[56] J. Duncan, et al. Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE, 2020, ICML.