AdaBiM: An adaptive proximal gradient method for structured convex bilevel optimization

Bilevel optimization is a general framework that bridges single- and multi-objective optimization and encompasses many formulations, including standard nonlinear programs. This work shows how elementary proximal gradient iterations can solve a wide class of convex bilevel optimization problems without resorting to subroutines. Compared with existing methods, the proposed approach (1) handles a wider class of problems, including nonsmooth terms in both the upper- and lower-level problems, (2) requires neither strong convexity nor global Lipschitz continuity of the gradients, and (3) provides a systematic adaptive stepsize selection strategy that allows large stepsizes while remaining insensitive to the choice of parameters.
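
To make the idea of solving a convex bilevel problem with plain proximal gradient steps concrete, the sketch below runs a classical diagonal (Tikhonov-type) scheme on a toy problem. It is not the AdaBiM algorithm and does not use its adaptive stepsize rule: the fixed stepsize, the vanishing weight `sigma`, the toy objectives, and all function names are assumptions chosen for illustration. The lower level minimizes 0.5 (x1 + x2 - 2)^2 over the nonnegative orthant, whose solutions form a segment, and the upper level selects the minimal-norm point on that segment, which is (1, 1).

```python
import math

def prox_nonneg(x):
    """Prox of the indicator of {x >= 0}, i.e. projection onto the orthant."""
    return [max(0.0, xi) for xi in x]

def grad_lower(x):
    """Gradient of the (assumed) lower-level smooth term 0.5*(x1 + x2 - 2)^2."""
    r = x[0] + x[1] - 2.0
    return [r, r]

def grad_upper(x):
    """Gradient of the (assumed) upper-level smooth term 0.5*||x||^2."""
    return list(x)

def diagonal_prox_grad(x0, step=0.4, iters=10_000):
    """Proximal gradient steps on f_lower + sigma_k * f_upper with sigma_k -> 0.

    Illustrative only: fixed stepsize and sigma_k = 1/sqrt(k+1) are
    hypothetical choices, not the paper's adaptive strategy.
    """
    x = list(x0)
    for k in range(iters):
        sigma = 1.0 / math.sqrt(k + 1)  # vanishing upper-level weight
        g_low, g_up = grad_lower(x), grad_upper(x)
        forward = [xi - step * (a + sigma * b)
                   for xi, a, b in zip(x, g_low, g_up)]
        x = prox_nonneg(forward)  # backward (proximal) step
    return x

# Start from a lower-level solution that is not the bilevel solution.
x = diagonal_prox_grad([2.0, 0.0])
print(x)  # should lie close to the minimal-norm lower-level solution (1, 1)
```

The weight on the upper-level objective must vanish slowly enough that the iterates keep moving along the lower-level solution set. That requirement is what makes parameter choices delicate in schemes of this kind, and it is the sensitivity that an adaptive stepsize strategy is meant to reduce.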
