On dissipative symplectic integration with applications to gradient-based optimization

Recently, continuous-time dynamical systems have proved useful in providing conceptual and quantitative insights into gradient-based optimization, which is widely used in modern machine learning and statistics. An important question that arises in this line of work is how to discretize the system so that its stability and rates of convergence are preserved. In this paper we propose a geometric framework in which such discretizations can be realized systematically, enabling the derivation of ‘rate-matching’ algorithms without the need for a discrete convergence analysis. More specifically, we show that a generalization of symplectic integrators to non-conservative, and in particular dissipative, Hamiltonian systems preserves rates of convergence up to a controlled error. Moreover, such methods preserve a shadow Hamiltonian despite the absence of a conservation law, extending key results on symplectic integrators to non-conservative cases. Our arguments rely on a combination of backward error analysis with fundamental results from symplectic geometry. We stress that, although the original motivation for this work was the application to optimization, where dissipative systems play a natural role, our results are fully general: they not only provide a differential geometric framework for dissipative Hamiltonian systems but also substantially extend the theory of structure-preserving integration.
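To make the idea of a dissipative symplectic discretization concrete, the following is a minimal sketch and not necessarily the scheme derived in the paper. It assumes the damped Hamiltonian system dq/dt = p, dp/dt = -∇f(q) - γp with a constant damping coefficient γ, and splits each step into an exact damping update of the momentum followed by a symplectic Euler step. The function name, the quadratic test objective, and all parameter values are illustrative assumptions.

```python
import numpy as np

def conformal_symplectic_euler(grad_f, q0, h=0.1, gamma=1.0, steps=2000):
    """Sketch of a splitting integrator for dq/dt = p, dp/dt = -grad f(q) - gamma * p.

    Assumed setting: separable Hamiltonian H = |p|^2 / 2 + f(q), constant damping.
    """
    q = np.asarray(q0, dtype=float).copy()
    p = np.zeros_like(q)
    decay = np.exp(-gamma * h)  # exact flow of the damping part dp/dt = -gamma * p
    for _ in range(steps):
        # Damp the momentum exactly, then apply the force (gradient) kick.
        p = decay * p - h * grad_f(q)
        # Drift the position with the updated momentum (symplectic Euler).
        q = q + h * p
    return q, p

# Illustrative ill-conditioned quadratic objective f(q) = 0.5 * q^T A q.
A = np.diag([1.0, 10.0])

def f(q):
    return 0.5 * q @ A @ q

def grad_f(q):
    return A @ q

q_final, p_final = conformal_symplectic_euler(grad_f, q0=[1.0, 1.0])
print(f(q_final))  # objective value after 2000 steps
```

Written in terms of position updates alone, this recursion has the form of the classical momentum (heavy-ball) method with momentum factor exp(-γh) and step size h². For a linear gradient the Jacobian of one step has determinant exp(-γh) per degree of freedom, so phase-space volume contracts at exactly the rate of the continuous damping flow.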
