An Iteratively Re-weighted Method for Problems with Sparsity-Inducing Norms

This work aims at solving the problems with intractable sparsity-inducing norms that are often encountered in various machine learning tasks, such as multi-task learning, subspace clustering, feature selection, robust principal component analysis, and so on. Specifically, an Iteratively Re-Weighted method (IRW) with solid convergence guarantee is provided. We investigate its convergence speed via numerous experiments on real data. Furthermore, in order to validate the practicality of IRW, we use it to solve a concrete robust feature selection model with complicated objective function. The experimental results show that the model coupled with proposed optimization method outperforms alternative methods significantly.

[1]  Feiping Nie,et al.  Low-Rank Matrix Recovery via Efficient Schatten p-Norm Minimization , 2012, AAAI.

[2]  Y. Nesterov Gradient methods for minimizing composite objective function , 2007 .

[3]  Yurii Nesterov,et al.  Smooth minimization of non-smooth functions , 2005, Math. Program..

[4]  Xuelong Li,et al.  Matrix completion by Truncated Nuclear Norm Regularization , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Feiping Nie,et al.  Joint Capped Norms Minimization for Robust Matrix Recovery , 2017, IJCAI.

[6]  Yong Yu,et al.  Robust Recovery of Subspace Structures by Low-Rank Representation , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Feiping Nie,et al.  Efficient and Robust Feature Selection via Joint ℓ2, 1-Norms Minimization , 2010, NIPS.

[8]  S. P. Fodor DNA SEQUENCING: Massively Parallel Genomics , 1997, Science.

[9]  Terence Sim,et al.  The CMU Pose, Illumination, and Expression (PIE) database , 2002, Proceedings of Fifth IEEE International Conference on Automatic Face Gesture Recognition.

[10]  R. Abseher,et al.  Microarray gene expression profiling of B-cell chronic lymphocytic leukemia subgroups defined by genomic aberrations and VH mutation status. , 2004, Journal of clinical oncology : official journal of the American Society of Clinical Oncology.

[11]  James T. Kwok,et al.  Efficient Learning with a Family of Nonconvex Regularizers by Redistributing Nonconvexity , 2016, ICML.

[12]  Larry A. Rendell,et al.  A Practical Approach to Feature Selection , 1992, ML.

[13]  J. Welsh,et al.  Molecular classification of human carcinomas by use of gene expression signatures. , 2001, Cancer research.

[14]  Emmanuel J. Candès,et al.  Exact Matrix Completion via Convex Optimization , 2008, Found. Comput. Math..

[15]  Christopher D. Manning,et al.  Introduction to Information Retrieval , 2010, J. Assoc. Inf. Sci. Technol..

[16]  Yi Ma,et al.  The Augmented Lagrange Multiplier Method for Exact Recovery of Corrupted Low-Rank Matrices , 2010, Journal of structural biology.

[17]  Massimiliano Pontil,et al.  Convex multi-task feature learning , 2008, Machine Learning.

[18]  E. Petricoin,et al.  Serum proteomic patterns for detection of prostate cancer. , 2002, Journal of the National Cancer Institute.

[19]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[20]  Cun-Hui Zhang Nearly unbiased variable selection under minimax concave penalty , 2010, 1002.4734.

[21]  Jianzhong Li,et al.  A stable gene selection in microarray data analysis , 2006, BMC Bioinformatics.

[22]  Y. Nesterov A method for solving the convex programming problem with convergence rate O(1/k^2) , 1983 .

[23]  Igor Kononenko,et al.  Estimating Attributes: Analysis and Extensions of RELIEF , 1994, ECML.

[24]  Kilian Stoffel,et al.  Theoretical Comparison between the Gini Index and Information Gain Criteria , 2004, Annals of Mathematics and Artificial Intelligence.

[25]  Yurii Nesterov,et al.  Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.

[26]  Prabhu Babu,et al.  Majorization-Minimization Algorithms in Signal Processing, Communications, and Machine Learning , 2017, IEEE Transactions on Signal Processing.

[27]  G. Sapiro,et al.  A collaborative framework for 3D alignment and classification of heterogeneous subvolumes in cryo-electron tomography. , 2013, Journal of structural biology.

[28]  Jianqing Fan,et al.  Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties , 2001 .

[29]  E. Lander,et al.  Gene expression correlates of clinical prostate cancer behavior. , 2002, Cancer cell.

[30]  Huan Li,et al.  Accelerated Proximal Gradient Methods for Nonconvex Programming , 2015, NIPS.

[31]  Stephen P. Boyd,et al.  Enhancing Sparsity by Reweighted ℓ1 Minimization , 2007, 0711.1612.

[32]  Andy Harter,et al.  Parameterisation of a stochastic model for human face identification , 1994, Proceedings of 1994 IEEE Workshop on Applications of Computer Vision.

[33]  P. Sebastiani,et al.  Airway epithelial gene expression in the diagnostic evaluation of smokers with suspect lung cancer , 2007, Nature Medicine.

[34]  Axel Ruhe Perturbation bounds for means of eigenvalues and invariant subspaces , 1970 .

[35]  Jieping Ye,et al.  A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems , 2013, ICML.

[36]  Alan L. Yuille,et al.  The Concave-Convex Procedure (CCCP) , 2001, NIPS.

[37]  Shuicheng Yan,et al.  Generalized Nonconvex Nonsmooth Low-Rank Minimization , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[38]  Paul Tseng,et al.  Trace Norm Regularization: Reformulations, Algorithms, and Multi-Task Learning , 2010, SIAM J. Optim..

[39]  Xuelong Li,et al.  Calibrated Multi-Task Learning , 2018, KDD.

[40]  T. Golub,et al.  Gene expression-based classification of malignant gliomas correlates better with survival than histological classification. , 2003, Cancer research.

[41]  Richard O. Duda,et al.  Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.

[42]  Ehsan Elhamifar,et al.  Sparse subspace clustering , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[43]  Maryam Fazel,et al.  Iterative reweighted algorithms for matrix rank minimization , 2012, J. Mach. Learn. Res..

[44]  Radu Ioan Bot,et al.  An inertial forward–backward algorithm for the minimization of the sum of two nonconvex functions , 2014, EURO J. Comput. Optim..

[45]  E. Lander,et al.  MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia , 2002, Nature Genetics.