[1] Brian Kingsbury, et al. Beyond Backprop: Alternating Minimization with co-Activation Memory, 2018, ArXiv.
[2] E Weinan, et al. A Proposal on Machine Learning via Dynamical Systems, 2017, Communications in Mathematics and Statistics.
[3] Stephen Berard, et al. Implications of Historical Trends in the Electrical Efficiency of Computing, 2011, IEEE Annals of the History of Computing.
[4] Lars Ruthotto, et al. Layer-Parallel Training of Deep Residual Neural Networks, 2018, SIAM J. Math. Data Sci.
[5] Yann Le Cun, et al. A Theoretical Framework for Back-Propagation, 1988.
[6] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[7] Hector O. Fattorini, et al. Infinite Dimensional Optimization and Control Theory: References, 1999.
[8] Katya Scheinberg, et al. Optimization Methods for Supervised Machine Learning: From Linear Models to Deep Learning, 2017, ArXiv.
[9] Yuan Yao, et al. A Proximal Block Coordinate Descent Algorithm for Deep Neural Network Training, 2018, ICLR.
[10] Eldad Haber, et al. Stable architectures for deep neural networks, 2017, ArXiv.
[11] Barak A. Pearlmutter, et al. Automatic differentiation in machine learning: a survey, 2015, J. Mach. Learn. Res.
[12] Jorge Nocedal, et al. Optimization Methods for Large-Scale Machine Learning, 2016, SIAM Rev.
[13] J. Lambert. Numerical Methods for Ordinary Differential Equations, 1991.
[14] Bin Gu, et al. Decoupled Parallel Backpropagation with Convergence Guarantee, 2018, ICML.
[15] Yuan Yao, et al. Block Coordinate Descent for Deep Learning: Unified Convergence Guarantees, 2018, ArXiv.
[16] Miguel Á. Carreira-Perpiñán, et al. Distributed optimization of deeply nested systems, 2012, AISTATS.
[17] K. Steinhubl. Design of Ion-Implanted MOSFET's with Very Small Physical Dimensions, 1974.
[18] David Duvenaud, et al. Neural Ordinary Differential Equations, 2018, NeurIPS.
[19] Brian Kingsbury, et al. Beyond Backprop: Online Alternating Minimization with Auxiliary Variables, 2018, ICML.
[20] Long Chen, et al. Maximum Principle Based Algorithms for Deep Learning, 2017, J. Mach. Learn. Res.
[21] Martin J. Gander, et al. Nonlinear Convergence Analysis for the Parareal Algorithm, 2008.
[22] M. Bardi, et al. Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations, 1997.
[23] Alex Graves, et al. Decoupled Neural Interfaces using Synthetic Gradients, 2016, ICML.
[24] William L. Briggs, et al. A multigrid tutorial, 1987.