Overleaf Example

We propose an algorithm that solves a class of bi-level optimization problems using only first-order information. In particular, we focus on problems whose inner minimization has a unique solution. Unlike contemporary algorithms, ours requires neither an oracle estimator for the gradient of the bi-level objective nor an approximate solver for the inner problem. Instead, we alternate between descending on the inner problem using naïve optimization methods and descending on the upper-level objective function using specially constructed gradient estimators. We provide non-asymptotic convergence rates to stationary points of the bi-level objective in the absence of convexity of the closed-loop function, and further show asymptotic convergence only to local minima of the bi-level problem. The approach is inspired by ideas from the literature on two-timescale stochastic approximation algorithms.
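
To make the alternating scheme concrete, the following is a minimal sketch on a toy problem, not the paper's algorithm. The abstract does not specify its gradient estimators, so this sketch swaps in a penalty-based first-order surrogate: the outer gradient is estimated as grad_x f(x, y) + lam * (grad_x g(x, y) - grad_x g(x, z)), where z tracks the minimizer of g(x, .) and y tracks the minimizer of f(x, .) + lam * g(x, .). The toy objectives, the function name two_timescale_bilevel, the penalty weight lam, and the step sizes alpha and beta are all assumptions chosen for illustration.

```python
def grad_f(x, y):
    """Gradients of the toy outer objective f(x, y) = 0.5*(y - 1)^2 + 0.5*x^2."""
    return x, y - 1.0  # (df/dx, df/dy)


def grad_g(x, y):
    """Gradients of the toy inner objective g(x, y) = 0.5*(y - x)^2.

    The inner problem has the unique minimizer y*(x) = x.
    """
    return x - y, y - x  # (dg/dx, dg/dy)


def two_timescale_bilevel(x0=2.0, steps=20000, alpha=1e-3, beta=0.5, lam=100.0):
    """Alternate fast inner descent with slow outer descent (illustrative only).

    alpha is the slow (outer) step size, beta the fast (inner) step size, and
    lam the penalty weight of the assumed first-order surrogate.
    """
    x, y, z = x0, x0, x0
    for _ in range(steps):
        # All gradients are evaluated at the current iterate (x, y, z).
        fx, fy = grad_f(x, y)
        gx_y, gy_y = grad_g(x, y)
        gx_z, gy_z = grad_g(x, z)

        # Fast timescale: plain gradient steps on the two inner problems.
        z_new = z - beta * gy_z
        # The penalized objective f + lam*g is (1 + lam) times steeper in y,
        # so its step is scaled down accordingly.
        y_new = y - beta / (1.0 + lam) * (fy + lam * gy_y)

        # Slow timescale: outer step using only first-order information.
        x_new = x - alpha * (fx + lam * (gx_y - gx_z))

        x, y, z = x_new, y_new, z_new
    return x, y, z
```

For this toy problem the bi-level objective is F(x) = 0.5*(x - 1)^2 + 0.5*x^2, whose minimizer is x = 0.5. The penalty surrogate instead has its stationary point at x = lam / (1 + 2*lam), so with lam = 100 the iterates settle slightly below 0.5; the gap shrinks as lam grows, at the cost of a stiffer inner problem and smaller stable step sizes.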
