A Conditional Gradient Framework for Composite Convex Minimization with Applications to Semidefinite Programming

We propose a conditional gradient framework for a composite convex minimization template with broad applications. Our approach combines smoothing and homotopy techniques within the conditional gradient method (CGM) and provably achieves the optimal $\mathcal{O}(1/\sqrt{k})$ convergence rate. We show that the same rate holds when the linear subproblems are solved approximately, with either additive or multiplicative error. In contrast with related work, we characterize the convergence even when the non-smooth term is an indicator function. Specific applications of our framework include non-smooth minimization, semidefinite programming, and minimization with linear inclusion constraints over a compact domain. Numerical evidence demonstrates the benefits of our framework.
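As a rough illustration of how smoothing and homotopy can be combined with a conditional gradient step, the sketch below minimizes a smooth function over the unit simplex subject to a linear equality constraint $Ax = b$. The abstract does not state the update rules, so everything here is an assumption made for illustration: the function name `hcgm_simplex`, the step size $\eta_k = 2/(k+1)$, the smoothing schedule $\beta_k = \beta_0/\sqrt{k+1}$, the quadratic smoothing $\frac{1}{2\beta}\|Ax - b\|^2$ of the indicator of $\{b\}$, and the toy problem itself.

```python
import numpy as np

def hcgm_simplex(grad_f, A, b, x0, beta0=1.0, iters=5000):
    """Sketch of a homotopy conditional gradient loop on the unit simplex.

    Assumed (not taken from the abstract): the schedules for eta and beta,
    and the quadratic smoothing of the constraint Ax = b.
    """
    x = np.asarray(x0, dtype=float).copy()
    for k in range(1, iters + 1):
        eta = 2.0 / (k + 1)              # assumed open-loop step size
        beta = beta0 / np.sqrt(k + 1)    # assumed decreasing smoothing parameter
        # Gradient of f plus gradient of the smoothed indicator of {b}.
        v = grad_f(x) + A.T @ (A @ x - b) / beta
        # Linear minimization oracle over the simplex: pick the best vertex.
        s = np.zeros_like(x)
        s[np.argmin(v)] = 1.0
        # Convex combination keeps the iterate inside the simplex.
        x = (1.0 - eta) * x + eta * s
    return x

# Toy problem (hypothetical): minimize 0.5 * ||x||^2 over the simplex
# subject to x[0] = 0.5.
A = np.array([[1.0, 0.0, 0.0]])
b = np.array([0.5])
x = hcgm_simplex(lambda z: z, A, b, np.array([0.0, 0.0, 1.0]))
gap = np.linalg.norm(A @ x - b)   # feasibility gap of the final iterate
print(x, gap)
```

The iterate never leaves the simplex because each update is a convex combination of the current point and a vertex, while the equality constraint is only enforced asymptotically as the smoothing parameter shrinks.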
