Sparse Network Lasso for Local High-dimensional Regression

We introduce the sparse network lasso, which is suited to problems with high dimensionality d and small sample size n, and which yields interpretable models in addition to high predictive power. More specifically, we consider a function that consists of local models, where each local model is sparse. We employ sample-wise network regularization and sample-wise exclusive group sparsity (a.k.a. the ℓ1,2 norm) to induce diversity among the local models, with the different selected feature sets interpreted as different local models. This helps to interpret not only the features but also the local models (i.e., samples) in practical problems. The proposed formulation is convex, and thus a globally optimal solution can be found. Moreover, we propose a simple yet efficient iterative least-squares-based optimization procedure for the sparse network lasso that needs no tuning parameter and is guaranteed to converge to a globally optimal solution. The method is empirically shown to outperform straightforward alternatives in predicting outputs on both simulated data and molecular biological personalized medicine data.
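To make the formulation concrete, one plausible way to write down the objective described above is given below. The squared loss and the exact weighting scheme are our assumptions for illustration (the abstract itself names only the two penalties); W = (w_1, ..., w_n) collects one local model w_i in R^d per training sample, r_ij >= 0 are network weights between samples, and lambda_1, lambda_2 are regularization parameters:

\min_{W} \sum_{i=1}^{n} \left( y_i - \mathbf{w}_i^{\top} \mathbf{x}_i \right)^2 + \lambda_1 \sum_{i,j=1}^{n} r_{ij} \, \lVert \mathbf{w}_i - \mathbf{w}_j \rVert_2 + \lambda_2 \sum_{i=1}^{n} \lVert \mathbf{w}_i \rVert_1^2

The unsquared group norm in the second term is the sample-wise network regularizer: it encourages the entire local models of linked samples to become exactly equal, clustering samples into shared models. The squared ℓ1 norm in the third term is the sample-wise exclusive group sparsity (ℓ1,2) penalty, which makes each individual local model sparse.

The iterative least-squares procedure mentioned above can then be sketched as follows. This is a minimal illustration assuming the objective written above and standard iteratively reweighted least-squares (IRLS) majorizations of the two penalties; the function name, initialization, and update order are ours, not the authors':

import numpy as np

def sparse_network_lasso(X, y, R, lam1=1.0, lam2=1.0, n_iter=100, eps=1e-6):
    """Minimal IRLS sketch for the assumed objective above; not the authors' code.
    X: (n, d) inputs, y: (n,) outputs, R: (n, n) symmetric nonnegative network
    weights with zero diagonal. Returns W: (n, d), one local model per sample."""
    n, d = X.shape
    # Ridge-like per-sample initialization keeps the IRLS weights well defined.
    W = (y / (np.einsum('ij,ij->i', X, X) + 1.0))[:, None] * X
    for _ in range(n_iter):
        # Network term: ||w_i - w_j||_2 is majorized at the current W by
        # ||w_i - w_j||_2^2 / (2 ||w_i_old - w_j_old||_2) + const.
        diff = np.linalg.norm(W[:, None, :] - W[None, :, :], axis=2)
        C = R / (2.0 * np.maximum(diff, eps))
        # Exclusive (l1,2) term: ||w||_1^2 is majorized at the current w by
        # sum_k (||w_old||_1 / |w_old_k|) * w_k^2.
        absW = np.maximum(np.abs(W), eps)
        D = absW.sum(axis=1, keepdims=True) / absW
        # With both penalties now quadratic, each local model w_i has a
        # closed-form update given the others: one d x d linear system.
        for i in range(n):
            A = (np.outer(X[i], X[i])
                 + 2.0 * lam1 * C[i].sum() * np.eye(d)
                 + lam2 * np.diag(D[i]))
            b = y[i] * X[i] + 2.0 * lam1 * (C[i] @ W)
            W[i] = np.linalg.solve(A, b)
    return W

Each sweep solves n small least-squares problems, which is the sense in which such a procedure is "iterative least-squares"; no step size or other optimization tuning parameter is required, matching the claim in the abstract.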
