Inertial Block Mirror Descent Method for Non-Convex Non-Smooth Optimization

In this paper, we propose inertial versions of block coordinate descent methods for solving non-convex non-smooth composite optimization problems. We use the general framework of Bregman distance functions to compute the proximal maps. Our methods not only allow using two different extrapolation points, one to evaluate gradients and one to add the inertial force, but also take advantage of randomly picking the block of variables to update. Moreover, our methods do not require a restarting step and, as such, they are not monotonically decreasing methods. To prove the convergence of the whole generated sequence to a critical point, we modify the convergence proof recipe of Bolte, Sabach and Teboulle (Proximal alternating linearized minimization for non-convex and non-smooth problems, Math. Program. 146(1):459-494, 2014), and combine it with auxiliary functions. We deploy the proposed methods to solve non-negative matrix factorization (NMF) and show that they compete favourably with the state-of-the-art NMF algorithms.
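As a rough illustration of the idea of two extrapolation points, the following minimal sketch applies an inertial block proximal gradient step to NMF with the objective 0.5*||X - WH||_F^2. It rests on several assumptions that are not taken from the paper: the Bregman distance is the squared Euclidean distance (so the proximal map reduces to projection onto the non-negative orthant), the blocks W and H are updated cyclically rather than randomly, and the extrapolation weights `alpha` and `beta` are fixed illustrative constants rather than the paper's parameter rule. The function name `inertial_block_nmf` is hypothetical.

```python
import numpy as np


def inertial_block_nmf(X, rank, iters=200, alpha=0.3, beta=0.2, seed=0):
    """Sketch of an inertial block proximal gradient method for NMF.

    alpha: extrapolation weight for the point where the gradient is evaluated.
    beta:  extrapolation weight for the point carrying the inertial force.
    Both are illustrative constants (an assumption, not the paper's rule).
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    W_prev, H_prev = W.copy(), H.copy()
    history = []

    for _ in range(iters):
        # Block W: gradient of 0.5*||X - W H||_F^2 in W is (W H - X) H^T.
        HHt = H @ H.T
        L = max(np.linalg.norm(HHt, 2), 1e-12)  # block Lipschitz constant
        W_grad_pt = W + alpha * (W - W_prev)    # point for the gradient
        W_inert_pt = W + beta * (W - W_prev)    # point for the inertial force
        grad = W_grad_pt @ HHt - X @ H.T
        # Euclidean prox of the non-negativity constraint is a projection.
        W_prev, W = W, np.maximum(0.0, W_inert_pt - grad / L)

        # Block H: gradient in H is W^T (W H - X).
        WtW = W.T @ W
        L = max(np.linalg.norm(WtW, 2), 1e-12)
        H_grad_pt = H + alpha * (H - H_prev)
        H_inert_pt = H + beta * (H - H_prev)
        grad = WtW @ H_grad_pt - W.T @ X
        H_prev, H = H, np.maximum(0.0, H_inert_pt - grad / L)

        # No restart step: the objective is recorded but not forced to decrease.
        history.append(0.5 * np.linalg.norm(X - W @ H, "fro") ** 2)

    return W, H, history


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((20, 3)) @ rng.random((3, 15))  # synthetic rank-3 data
    W, H, history = inertial_block_nmf(X, rank=3)
    print(f"first objective: {history[0]:.4f}, last objective: {history[-1]:.4f}")
```

Because there is no restart, the recorded objective values need not decrease at every iteration; only the overall trend is expected to go down in this sketch.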

[1]  Amir Beck,et al.  On the Convergence of Block Coordinate Descent Type Methods , 2013, SIAM J. Optim..

[2]  Nicolas Gillis,et al.  Accelerated Multiplicative Updates and Hierarchical ALS Algorithms for Nonnegative Matrix Factorization , 2011, Neural Computation.

[3]  Haesun Park,et al.  Algorithms for nonnegative matrix and tensor factorizations: a unified view based on block coordinate descent framework , 2014, J. Glob. Optim..

[4]  P. Tseng Convergence of a Block Coordinate Descent Method for Nondifferentiable Minimization , 2001 .

[5]  Nicolas Gillis,et al.  Hierarchical Clustering of Hyperspectral Images Using Rank-Two Nonnegative Matrix Factorization , 2013, IEEE Transactions on Geoscience and Remote Sensing.

[6]  Peter Richtárik,et al.  Accelerated, Parallel, and Proximal Coordinate Descent , 2013, SIAM J. Optim..

[7]  Nicolas Gillis,et al.  The Why and How of Nonnegative Matrix Factorization , 2014, ArXiv.

[8]  Boris Polyak Some methods of speeding up the convergence of iteration methods , 1964 .

[9]  Marc Teboulle,et al.  First Order Methods beyond Convexity and Lipschitz Gradient Continuity with Applications to Quadratic Inverse Problems , 2017, SIAM J. Optim..

[10]  Masoud Ahookhosh,et al.  Bregman forward-backward splitting for nonconvex composite optimization: superlinear convergence to nonisolated critical points , 2019, 1905.11904.

[11]  Heinz H. Bauschke,et al.  Convex Analysis and Monotone Operator Theory in Hilbert Spaces , 2011, CMS Books in Mathematics.

[12]  Wotao Yin,et al.  A Globally Convergent Algorithm for Nonconvex Optimization Based on Block Coordinate Update , 2014, J. Sci. Comput..

[13]  Nicolas Gillis,et al.  Accelerating Nonnegative Matrix Factorization Algorithms Using Extrapolation , 2018, Neural Computation.

[14]  Marc Teboulle,et al.  Convergence Analysis of a Proximal-Like Minimization Algorithm Using Bregman Functions , 1993, SIAM J. Optim..

[15]  Y. Nesterov A method for solving the convex programming problem with convergence rate O(1/k^2) , 1983 .

[16]  Marc Teboulle,et al.  Convergence of Proximal-Like Algorithms , 1997, SIAM J. Optim..

[17]  Adrian S. Lewis,et al.  The Łojasiewicz Inequality for Nonsmooth Subanalytic Functions with Applications to Subgradient Dynamical Systems , 2006, SIAM J. Optim..

[18]  K. Kurdyka On gradients of functions definable in o-minimal structures , 1998 .

[19]  Jonathan Eckstein,et al.  Nonlinear Proximal Point Algorithms Using Bregman Functions, with Applications to Convex Programming , 1993, Math. Oper. Res..

[20]  Yurii Nesterov,et al.  Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.

[21]  M. J. D. Powell,et al.  On search directions for minimization algorithms , 1973, Math. Program..

[22]  Mike E. Davies,et al.  Iterative Hard Thresholding for Compressed Sensing , 2008, ArXiv.

[23]  S. K. Zavriev,et al.  Heavy-ball method in nonconvex optimization problems , 1993 .

[24]  Alvaro R. De Pierro,et al.  Re-examination of Bregman functions and new properties of their divergences , 2018, Optimization.

[25]  Wotao Yin,et al.  A Block Coordinate Descent Method for Regularized Multiconvex Optimization with Applications to Nonnegative Tensor Factorization and Completion , 2013, SIAM J. Imaging Sci..

[26]  A. Bruckstein,et al.  K-SVD: An Algorithm for Designing of Overcomplete Dictionaries for Sparse Representation , 2005 .

[27]  Marc Teboulle,et al.  A Descent Lemma Beyond Lipschitz Gradient Continuity: First-Order Methods Revisited and Applications , 2017, Math. Oper. Res..

[28]  Peter Ochs,et al.  Unifying Abstract Inexact Convergence Theorems and Block Coordinate Variable Metric iPiano , 2016, SIAM J. Optim..

[29]  Yurii Nesterov,et al.  Smooth minimization of non-smooth functions , 2005, Math. Program..

[30]  Marc Teboulle,et al.  A simplified view of first order methods for optimization , 2018, Math. Program..

[32]  Andrzej Cichocki,et al.  Nonnegative Matrix and Tensor Factorization , 2007 .

[33]  Marc Teboulle,et al.  Interior Gradient and Proximal Methods for Convex and Conic Optimization , 2006, SIAM J. Optim..

[34]  Stephen P. Boyd,et al.  Proximal Algorithms , 2013, Found. Trends Optim..

[35]  M. Elad,et al.  K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation , 2006, IEEE Transactions on Signal Processing.

[36]  Balas K. Natarajan,et al.  Sparse Approximate Solutions to Linear Systems , 1995, SIAM J. Comput..

[37]  Radu Ioan Bot,et al.  An Inertial Tseng’s Type Proximal Algorithm for Nonsmooth and Nonconvex Optimization Problems , 2014, J. Optim. Theory Appl..

[38]  E. M. L. Beale,et al.  Nonlinear Programming: A Unified Approach. , 1970 .

[39]  Thomas Pock,et al.  Inertial Proximal Alternating Linearized Minimization (iPALM) for Nonconvex and Nonsmooth Problems , 2016, SIAM J. Imaging Sci..

[40]  Hédy Attouch,et al.  Proximal Alternating Minimization and Projection Methods for Nonconvex Problems: An Approach Based on the Kurdyka-Lojasiewicz Inequality , 2008, Math. Oper. Res..

[41]  Zhi-Quan Luo,et al.  A Unified Convergence Analysis of Block Successive Minimization Methods for Nonsmooth Optimization , 2012, SIAM J. Optim..

[42]  Luigi Grippo,et al.  On the convergence of the block nonlinear Gauss-Seidel method under convex constraints , 2000, Oper. Res. Lett..

[43]  A. Auslender Asymptotic properties of the fenchel dual functional and applications to decomposition problems , 1992 .

[44]  Wotao Yin,et al.  A fast patch-dictionary method for whole image recovery , 2014, ArXiv.

[45]  Yurii Nesterov,et al.  Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems , 2012, SIAM J. Optim..

[46]  Thomas Brox,et al.  iPiano: Inertial Proximal Algorithm for Nonconvex Optimization , 2014, SIAM J. Imaging Sci..

[47]  Marc Teboulle,et al.  Proximal alternating linearized minimization for nonconvex and nonsmooth problems , 2013, Mathematical Programming.

[48]  Paul Tseng,et al.  A coordinate gradient descent method for nonsmooth separable minimization , 2008, Math. Program..

[49]  Marie-Françoise Roy,et al.  Real algebraic geometry , 1992 .

[50]  Clifford Hildreth,et al.  A quadratic programming procedure , 1957 .

[51]  Benar Fux Svaiter,et al.  Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods , 2013, Math. Program..