On Differentiating Parameterized Argmin and Argmax Problems with Application to Bi-level Optimization

Some recent works in machine learning and computer vision involve solving a bi-level optimization problem, in which the solution of a parameterized lower-level problem binds variables that appear in the objective of an upper-level problem. The lower-level problem typically takes the form of an argmin or argmax optimization problem. Many techniques have been proposed for solving bi-level optimization problems, including gradient descent, which is popular in current end-to-end learning approaches. In this technical report we collect some results on differentiating argmin and argmax optimization problems, with and without constraints, and provide some motivating examples.
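To make the idea of differentiating an argmin concrete, the following is a minimal sketch for the unconstrained scalar case. It assumes a smooth lower-level objective f(x, y) with positive second derivative in y, and uses the standard implicit-differentiation identity g'(x) = -f_XY / f_YY for g(x) = argmin_y f(x, y). The particular objective and all function names (`solve_lower`, `argmin_gradient`, `finite_difference`) are hypothetical and chosen only for illustration; they are not taken from the report.

```python
import math

# Illustrative lower-level objective (an assumption, not from the report):
#   f(x, y) = 0.5 * (1 + x**2) * y**2 - y * sin(x)
# It is strictly convex in y, with closed-form minimizer sin(x) / (1 + x**2).

def f_y(x, y):
    # First derivative of f with respect to y.
    return (1.0 + x * x) * y - math.sin(x)

def f_yy(x, y):
    # Second derivative of f with respect to y (always positive here).
    return 1.0 + x * x

def f_xy(x, y):
    # Mixed second derivative of f with respect to x and y.
    return 2.0 * x * y - math.cos(x)

def solve_lower(x, y0=0.0, iters=20):
    # Solve the lower-level problem g(x) = argmin_y f(x, y) by Newton's method.
    y = y0
    for _ in range(iters):
        y -= f_y(x, y) / f_yy(x, y)
    return y

def argmin_gradient(x):
    # Implicit differentiation of the optimality condition f_y(x, g(x)) = 0
    # gives g'(x) = -f_xy / f_yy, evaluated at y = g(x).
    y = solve_lower(x)
    return -f_xy(x, y) / f_yy(x, y)

def finite_difference(x, h=1e-6):
    # Central-difference estimate of g'(x), used only as a sanity check.
    return (solve_lower(x + h) - solve_lower(x - h)) / (2.0 * h)

if __name__ == "__main__":
    x = 0.7
    print("implicit gradient:", argmin_gradient(x))
    print("finite difference:", finite_difference(x))
```

In a bi-level setting, a gradient of this kind would be combined with the upper-level objective's derivatives through the chain rule, so that gradient descent on the upper-level variables can proceed without differentiating through the lower-level solver's iterations.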
