Selective linearization for multi-block statistical learning

Abstract. We consider the problem of minimizing a sum of several convex non-smooth functions and discuss the selective linearization method (SLIN), which iteratively linearizes all but one of the functions and employs simple proximal steps. The algorithm is a form of multiple operator splitting in which the order of processing the component functions is not fixed in advance but is determined in the course of the calculations. SLIN is globally convergent for an arbitrary number of component functions, without artificial duplication of variables. We report results from extensive numerical experiments in two statistical learning settings: the large-scale overlapping group Lasso and the doubly regularized support vector machine. In each setting, we introduce novel and efficient methods for solving the subproblems. The numerical results demonstrate the efficacy and accuracy of SLIN.
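
To make the phrase "linearizes all but one of the functions and employs simple proximal steps" concrete, the following is one possible way to write a single subproblem of this kind. The notation is an assumption introduced here for illustration and does not appear in the abstract: N is the number of component functions f_i, x^k is the current proximal center, rho > 0 is a proximal parameter, z_i is the point at which f_i was last linearized, g_i is a subgradient of f_i at z_i, and j is the index of the function left unlinearized. The rule for choosing j and the test for updating x^k are not stated in the abstract and are not shown.

```latex
% Problem: minimize a sum of N convex, possibly non-smooth functions.
\[
  \min_{x \in \mathbb{R}^n} \; F(x) = \sum_{i=1}^{N} f_i(x).
\]
% Linearization of f_i at a point z_i, with g_i a subgradient of f_i at z_i.
\[
  \tilde f_i(x) = f_i(z_i) + \langle g_i,\, x - z_i \rangle,
  \qquad g_i \in \partial f_i(z_i).
\]
% Subproblem: keep f_j exact, linearize every other f_i,
% and add a proximal term centered at x^k.
\[
  z_j = \operatorname*{arg\,min}_{x}
  \Big\{ f_j(x) + \sum_{i \neq j} \tilde f_i(x)
         + \frac{\rho}{2}\,\lVert x - x^k \rVert^2 \Big\}.
\]
% Completing the square shows this is a single proximal step on f_j:
\[
  z_j = \operatorname{prox}_{f_j/\rho}\!\Big( x^k - \frac{1}{\rho} \sum_{i \neq j} g_i \Big),
  \qquad
  \operatorname{prox}_{t f}(v) = \operatorname*{arg\,min}_{x}
  \Big\{ f(x) + \frac{1}{2t}\,\lVert x - v \rVert^2 \Big\}.
\]
```

Written this way, each step needs only the proximal operator of one component function, evaluated at a point shifted by the sum of the other components' subgradients. All N functions act on the same variable x, which is consistent with the abstract's remark that no duplication of variables is required.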
