A Geometric Integration Approach to Nonsmooth, Nonconvex Optimisation

Optimising nonsmooth, nonconvex functions without access to gradients is a particularly challenging problem that arises frequently, for example in model parameter optimisation. Bilevel optimisation of parameters is a standard setting in areas such as variational regularisation and supervised machine learning. We present efficient and robust derivative-free methods, called randomised Itoh–Abe methods. These generalise the Itoh–Abe discrete gradient method, a well-known scheme from geometric integration that has previously been considered only in the smooth setting. We show that the method and its favourable energy dissipation properties are well defined in the nonsmooth setting. Furthermore, we prove that whenever the objective function is locally Lipschitz continuous, the iterates almost surely converge to a connected set of Clarke stationary points. We present an implementation of the methods and apply it to various test problems. The numerical results indicate that the randomised Itoh–Abe methods can outperform state-of-the-art derivative-free optimisation methods on nonsmooth problems while remaining competitive in terms of efficiency.
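To make the idea concrete, the following is a minimal sketch of one randomised Itoh–Abe step, not the authors' implementation. It assumes the standard Itoh–Abe update along a direction d: find a nonzero scalar alpha with f(x + alpha d) - f(x) = -alpha^2 / tau and set the new iterate to x + alpha d, which needs only function values and gives the decrease f(x_new) - f(x) = -||x_new - x||^2 / tau. The function names, the step size `tau`, the probe length `eps`, and the bracketing-plus-bisection solver for the scalar equation are all illustrative assumptions.

```python
import numpy as np


def itoh_abe_step(f, x, d, tau=1.0, eps=1e-8, max_doublings=200, bisections=60):
    """One Itoh-Abe step along the unit direction d (illustrative sketch).

    Approximately solves g(a) = f(x + a*d) - f(x) + a**2 / tau = 0 for a != 0
    and returns x + a*d. If no decrease is found along d or -d, x is returned.
    """
    fx = f(x)

    def g(a):
        return f(x + a * d) - fx + a * a / tau

    for sign in (1.0, -1.0):
        lo = sign * eps
        if g(lo) >= 0.0:
            continue  # no sufficient decrease in this orientation
        # Expand until g changes sign; this happens when f is bounded below.
        hi = 2.0 * lo
        n = 0
        while g(hi) < 0.0 and n < max_doublings:
            lo, hi = hi, 2.0 * hi
            n += 1
        if g(hi) < 0.0:
            continue  # failed to bracket a root
        # Bisection, keeping the invariant g(lo) < 0 <= g(hi).
        for _ in range(bisections):
            mid = 0.5 * (lo + hi)
            if g(mid) < 0.0:
                lo = mid
            else:
                hi = mid
        # g(lo) < 0 means f(x + lo*d) < f(x) - lo**2 / tau, so f decreases.
        return x + lo * d
    return x


def randomised_itoh_abe_step(f, x, rng, tau=1.0):
    """Draw a direction uniformly from the unit sphere, then take one step."""
    d = rng.standard_normal(x.shape)
    d /= np.linalg.norm(d)
    return itoh_abe_step(f, x, d, tau=tau)


if __name__ == "__main__":
    # Nonsmooth test objective (hypothetical example): the l1 norm.
    f = lambda z: float(np.abs(z).sum())
    rng = np.random.default_rng(0)
    x = np.array([1.0, -2.0])
    for _ in range(200):
        x = randomised_itoh_abe_step(f, x, rng)
    print(f(x))
```

Returning the lower end of the bracket rather than the midpoint is a deliberate choice in this sketch: it slightly undershoots the exact root, but it guarantees that the objective value never increases from one iterate to the next, even in floating-point arithmetic.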
