Interpolation Technique to Speed Up Gradients Propagation in Neural ODEs
Alexandr Katrutsa | Andrzej Cichocki | Ivan Oseledets | Talgat Daulbaev | Larisa Markeeva | Julia Gusak
[1] Kurt Keutzer, et al. ANODE: Unconditionally Accurate Memory-Efficient Gradients for Neural ODEs, 2019, IJCAI.
[2] Bin Dong, et al. Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations, 2017, ICML.
[3] Andreas Fichtner, et al. The adjoint method in seismology – I. Theory, 2006.
[4] Niles A. Pierce, et al. An Introduction to the Adjoint Approach to Design, 2000.
[5] Lloyd N. Trefethen, et al. Barycentric Lagrange Interpolation, 2004, SIAM Rev.
[6] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] L. Shampine. Interpolation for Runge–Kutta Methods, 1985.
[8] Sergey Pavlov, et al. “Zhores” — Petaflops supercomputer for data-driven modeling, machine learning and artificial intelligence installed in Skolkovo Institute of Science and Technology, 2019, Open Engineering.
[9] Richard G. Baraniuk, et al. InfoCNF: An Efficient Conditional Continuous Normalizing Flow with Adaptive Solvers, 2019, ArXiv.
[10] R.M.M. Mattheij, et al. Stability and asymptotic estimates in nonautonomous linear differential systems, 1985.
[11] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[12] Yee Whye Teh, et al. Augmented Neural ODEs, 2019, NeurIPS.
[13] Kurt Keutzer, et al. ANODEV2: A Coupled Neural ODE Evolution Framework, 2019, ArXiv.
[14] J. Dormand, et al. A family of embedded Runge-Kutta formulae, 1980.
[15] Eldad Haber, et al. Deep Neural Networks Motivated by Partial Differential Equations, 2018, Journal of Mathematical Imaging and Vision.
[16] B. Roe, et al. Boosted decision trees as an alternative to artificial neural networks for particle identification, 2004, physics/0408124.
[17] Guriĭ Ivanovich Marchuk, et al. Adjoint Equations and Analysis of Complex Systems, 1995.
[18] Daniel J. Arrigo, et al. An Introduction to Partial Differential Equations, 2017.
[19] L. Shampine, et al. Some practical Runge-Kutta formulas, 1986.
[20] Frederick Tung, et al. Multi-level Residual Networks from Dynamical Systems View, 2017, ICLR.
[21] Jonathan Masci, et al. SNODE: Spectral Discretization of Neural ODEs for System Identification, 2020, ICLR.
[22] G. Söderlind, et al. The logarithmic norm. History and modern theory, 2006.
[23] Alexandr Katrutsa, et al. Towards Understanding Normalization in Neural ODEs, 2020, ICLR.
[24] David Duvenaud, et al. Neural Ordinary Differential Equations, 2018, NeurIPS.
[25] David Duvenaud, et al. Latent Ordinary Differential Equations for Irregularly-Sampled Time Series, 2019, NeurIPS.
[26] Bengt Fornberg, et al. A practical guide to pseudospectral methods: Introduction, 1996.
[27] R. Plessix. A review of the adjoint-state method for computing the gradient of a functional with geophysical applications, 2006.
[28] E. Hairer, et al. Solving Ordinary Differential Equations II, 2010.
[29] David Duvenaud, et al. FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models, 2018, ICLR.
[30] Ali Ramadhan, et al. Universal Differential Equations for Scientific Machine Learning, 2020, ArXiv.
[31] N. Higham. The numerical stability of barycentric Lagrange interpolation, 2004.
[32] Diederik P. Kingma, et al. Stochastic Gradient VB and the Variational Auto-Encoder, 2013.
[33] M. C. Hall, et al. Application of adjoint sensitivity theory to an atmospheric general circulation model, 1986.
[34] R. Serban, et al. CVODES: The Sensitivity-Enabled ODE Solver in SUNDIALS, 2005.
[35] Manuel Calvo, et al. Stiffness 1952–2012: Sixty years in search of a definition, 2015.