Selective Linearization For Multi-Block Convex Optimization

We consider the problem of minimizing a sum of several convex non-smooth functions. We introduce a new algorithm called the selective linearization method, which iteratively linearizes all but one of the functions and employs simple proximal steps. The algorithm is a form of multiple operator splitting in which the order of processing partial functions is not fixed, but rather determined in the course of calculations. Global convergence is proved and estimates of the convergence rate are derived. Specifically, the number of iterations needed to achieve solution accuracy $\varepsilon$ is of order $\mathcal{O}\big(\ln(1/\varepsilon)/\varepsilon\big)$. We also illustrate the operation of the algorithm on structured regularization problems.
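The iteration described above can be sketched on a toy problem. The following is a minimal illustration, not the authors' implementation: it minimizes $\sum_i |x - a_i|$ over a scalar $x$, keeps one function exact at each step, replaces the others by linear models, and takes a proximal step. The function name `selective_linearization`, the parameters `rho`, `beta` and `tol`, the descent test, the stopping rule, and the choice of the next exact function by largest linearization error are all assumptions made for this sketch; the abstract states only that the processing order is determined during the calculations.

```python
def soft(v, t):
    """Soft-thresholding: the proximal map of t * |.| evaluated at v."""
    return max(v - t, 0.0) - max(-v - t, 0.0)


def selective_linearization(a, x0, rho=1.0, beta=0.5, tol=1e-9, max_iter=1000):
    """Toy sketch: minimize F(x) = sum_i |x - a[i]| over a scalar x.

    All rules below (descent test, index selection, stopping) are
    illustrative assumptions, not taken from the source text.
    """
    m = len(a)

    def f(i, y):
        return abs(y - a[i])

    def F(y):
        return sum(f(i, y) for i in range(m))

    x = float(x0)
    z = [x] * m  # points where each f_i is currently linearized
    g = [float((x > a[i]) - (x < a[i])) for i in range(m)]  # subgradients at z[i]

    def lin(i, y):
        # Linear model of f_i built at z[i] with subgradient g[i].
        return f(i, z[i]) + g[i] * (y - z[i])

    j = 0  # index of the function kept exact in the current step
    for _ in range(max_iter):
        # Slope of the summed linear models of all functions except f_j.
        s = sum(g[i] for i in range(m) if i != j)
        # Proximal step: argmin_y f_j(y) + s*y + (rho/2)*(y - x)^2.
        y = a[j] + soft(x - s / rho - a[j], 1.0 / rho)
        # Optimality of y gives a subgradient of f_j at y.
        g[j] = -s - rho * (y - x)
        z[j] = y
        # Model value at y: f_j exact, the others linearized.
        model = f(j, y) + sum(lin(i, y) for i in range(m) if i != j)
        gap = F(x) - model  # predicted decrease, nonnegative by convexity
        if gap <= tol:
            break
        # Assumed rule: next exact function is the worst-approximated one.
        others = [i for i in range(m) if i != j]
        nxt = max(others, key=lambda i: f(i, y) - lin(i, y))
        # Assumed descent test: move only on sufficient actual decrease.
        if F(y) <= F(x) - beta * gap:
            x = y
        j = nxt
    return x
```

For example, `selective_linearization([0.0, 1.0, 5.0], x0=10.0)` ends at the median `1.0`, which minimizes the sum of absolute deviations. Each subproblem reduces to a single soft-thresholding call, which is the sense in which the proximal steps are simple.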
