MOCCA: Mirrored Convex/Concave Optimization for Nonconvex Composite Functions

Many optimization problems arising in high-dimensional statistics decompose naturally into a sum of several terms, where the individual terms are relatively simple but the composite objective function can only be optimized with iterative algorithms. In this paper, we are interested in minimizing objective functions of the form F(Kx) + G(x), where K is a fixed linear transformation, while F and G are functions that may be nonconvex and/or nondifferentiable. In particular, if either term is nonconvex, existing alternating minimization techniques may fail to converge, while other existing approaches may be unable to handle nondifferentiability. We propose the MOCCA (mirrored convex/concave) algorithm, a primal/dual optimization approach that takes a local convex approximation to each term at every iteration. Inspired by optimization problems arising in computed tomography (CT) imaging, this algorithm can handle a range of nonconvex composite optimization problems, and offers theoretical guarantees for convergence when the overall problem is approximately convex (that is, any concavity in one term is balanced out by convexity in the other term). Empirical results show fast convergence for several structured signal recovery problems.
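To make the idea of "a local convex approximation at every iteration" concrete, the following is a minimal sketch in the spirit of the approach described above, not the MOCCA update rule itself. Everything specific in it is an assumption chosen for illustration: the 1-D denoising objective G(x) = 0.5*||x - y||^2, the finite-difference operator K, the log penalty F(w) = lam * sum(beta * log(1 + |w_i|/beta)), the step sizes, and the choice to re-linearize at the current point Kx. The penalty is split as lam*||w||_1 plus a smooth concave remainder; the remainder is replaced by its tangent at each iteration, and one standard primal/dual step is taken on the resulting convex surrogate. With lam/beta small, the concavity of the penalty is outweighed by the strong convexity of G, which is the "approximately convex" regime mentioned in the abstract.

```python
import numpy as np

def K(x):
    # Forward differences: (Kx)_i = x_{i+1} - x_i (illustrative choice of K).
    return np.diff(x)

def Kt(u):
    # Adjoint of K: (K^T u)_j = u_{j-1} - u_j, with zero padding at both ends.
    return -np.diff(np.concatenate(([0.0], u, [0.0])))

def penalty(w, beta):
    # Nonconvex log penalty, applied elementwise (assumed example of F).
    return beta * np.log1p(np.abs(w) / beta)

def objective(x, y, lam, beta):
    # G(x) + F(Kx) with G(x) = 0.5 * ||x - y||^2.
    return 0.5 * np.sum((x - y) ** 2) + lam * np.sum(penalty(K(x), beta))

def linearized_primal_dual(y, lam=0.5, beta=4.0, tau=0.45, sigma=0.45, n_iter=500):
    # Step sizes satisfy tau * sigma * ||K||^2 < 1, since ||K||^2 <= 4.
    x = y.copy()
    x_bar = x.copy()
    u = np.zeros(y.size - 1)
    for _ in range(n_iter):
        # Split F(w) = lam*||w||_1 + F_diff(w), where F_diff is smooth and
        # concave, and linearize F_diff at the current point w = Kx.
        w = K(x)
        g = -lam * w / (beta + np.abs(w))  # gradient of F_diff at w
        # Dual step: the conjugate of w -> lam*||w||_1 + <g, w> is the
        # indicator of the box [g - lam, g + lam], so its prox is a clip.
        u = np.clip(u + sigma * K(x_bar), g - lam, g + lam)
        # Primal step: closed-form prox of tau * G.
        x_new = (x - tau * Kt(u) + tau * y) / (1.0 + tau)
        # Extrapolation, as in standard first-order primal/dual schemes.
        x_bar = 2.0 * x_new - x
        x = x_new
    return x

# Synthetic piecewise-constant signal plus noise (made-up test data).
rng = np.random.default_rng(0)
truth = np.repeat([0.0, 2.0, -1.0, 1.0], 25)
y = truth + 0.3 * rng.standard_normal(truth.size)
x_hat = linearized_primal_dual(y)
```

Comparing `objective(x_hat, y, 0.5, 4.0)` with `objective(y, y, 0.5, 4.0)` gives a quick check that the iterations reduce the nonconvex objective relative to the noisy starting point. Increasing `lam / beta` pushes the problem away from the approximately convex regime, where a sketch like this carries no convergence guarantee.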
