On Convergence of the Maximum Block Improvement Method

The MBI (maximum block improvement) method is a greedy approach to solving optimization problems in which the decision variables can be grouped into a finite number of blocks. Assuming that optimizing over one block of variables while fixing all others is relatively easy, the MBI method updates, at each iteration, only the block whose update yields the maximum improvement. This is arguably the most natural and simple way to tackle block-structured problems, with great potential for engineering applications. In this paper we establish global and local linear convergence results for this method. The global convergence is established under the Łojasiewicz inequality assumption, while the local analysis invokes second-order assumptions. In particular, we study the tensor optimization model with spherical constraints. Conditions for linear convergence of the famous power method for computing the maximum eigenvalue of a matrix follow in this framework as a special case. The condition is interpreted in v...
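To make the greedy block-update rule concrete, below is a minimal NumPy sketch (not taken from the paper) that applies MBI to the bilinear spherical problem max x^T A y subject to ||x|| = ||y|| = 1, whose optimal value is the largest singular value of A. The function name, random initialization, and stopping rule are illustrative assumptions; only the "update the maximally improving block" step reflects the method described above.

```python
import numpy as np

def mbi_rank_one(A, max_iter=500, tol=1e-10, seed=0):
    """Illustrative MBI sketch for  max x^T A y  s.t. ||x|| = ||y|| = 1.

    The two blocks are x and y. Each block subproblem has a closed-form
    solution: for fixed y the best x is A y / ||A y||, and for fixed x
    the best y is A^T x / ||A^T x||. MBI evaluates both candidate
    updates and applies only the one giving the larger objective value.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = rng.standard_normal(m); x /= np.linalg.norm(x)
    y = rng.standard_normal(n); y /= np.linalg.norm(y)
    f = x @ A @ y
    for _ in range(max_iter):
        # Solve each block subproblem with the other block held fixed.
        Ay = A @ y
        x_new = Ay / np.linalg.norm(Ay)    # best x for the current y
        ATx = A.T @ x
        y_new = ATx / np.linalg.norm(ATx)  # best y for the current x
        f_x = x_new @ A @ y                # objective if x is updated
        f_y = x @ A @ y_new                # objective if y is updated
        # Greedy step: keep only the maximally improving block update.
        if f_x >= f_y:
            x, f_next = x_new, f_x
        else:
            y, f_next = y_new, f_y
        if f_next - f <= tol:              # improvement has stalled
            break
        f = f_next
    return f, x, y
```

One way to see the power-method connection mentioned above: for a symmetric positive semidefinite A with the two blocks tied together (x = y), each block update reduces to the classical power iteration x ← Ax/||Ax||, so convergence conditions for MBI specialize to conditions for the power method.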
