On the Convergence Rate of Decomposable Submodular Function Minimization

Submodular functions describe a variety of discrete problems in machine learning, signal processing, and computer vision. However, minimizing submodular functions poses a number of algorithmic challenges. Recent work introduced an easy-to-use, parallelizable algorithm for minimizing submodular functions that decompose as the sum of "simple" submodular functions. Empirically, this algorithm performs extremely well, but no theoretical analysis was given. In this paper, we show that the algorithm converges linearly, and we provide upper and lower bounds on the rate of convergence. Our proof relies on the geometry of submodular polyhedra and draws on results from spectral graph theory.
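For readers less familiar with the terminology, the three notions the abstract relies on can be written out as follows. These are the standard textbook definitions, not formulas quoted from the paper, and the symbols $V$, $F$, $F_i$, $r$, $x_k$, $x^*$, $C$, and $\rho$ are notation assumed here purely for illustration. The paper's own convergence statement may measure progress in a different quantity.

```latex
% Submodularity of a set function F : 2^V \to \mathbb{R} (diminishing returns)
F(A) + F(B) \;\ge\; F(A \cup B) + F(A \cap B)
  \qquad \text{for all } A, B \subseteq V

% Decomposable objective: a sum of r "simple" submodular functions F_i
\min_{A \subseteq V} F(A),
  \qquad F(A) = \sum_{i=1}^{r} F_i(A)

% Linear (geometric) convergence of iterates x_k to a solution x^*
\| x_k - x^* \| \;\le\; C \rho^{k},
  \qquad C > 0, \quad 0 \le \rho < 1
```

In this generic form, upper and lower bounds on the rate of convergence correspond to bounds on how small the factor $\rho$ can be guaranteed to be and how small it can possibly be.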
