Optimization on manifolds: A symplectic approach

arXiv:2107.11231v1 [cond-mat.stat-mech] 23 Jul 2021

There has been great interest in using tools from dynamical systems and the numerical analysis of differential equations to understand and construct new optimization methods. In particular, a new paradigm has recently emerged that applies ideas from mechanics and geometric integration to obtain accelerated optimization methods on Euclidean spaces. This has important consequences, given that accelerated methods are the workhorses behind many machine learning applications. In this paper we build upon these advances and propose a framework for dissipative and constrained Hamiltonian systems that is suitable for solving optimization problems on arbitrary smooth manifolds. Importantly, this allows us to leverage the well-established theory of symplectic integration to derive “rate-matching” dissipative integrators. This brings a new perspective to optimization on manifolds, whereby convergence guarantees follow by construction from classical arguments in symplectic geometry and backward error analysis. Moreover, we construct two dissipative generalizations of leapfrog that are straightforward to implement: one for Lie groups and homogeneous spaces, which relies on the tractable geodesic flow or a retraction thereof, and another for constrained submanifolds, which is based on a dissipative generalization of the well-known RATTLE integrator.
