Distributed Lasso for in-network linear regression

The least-absolute shrinkage and selection operator (Lasso) is a popular tool for joint estimation and continuous variable selection, especially well suited for under-determined but sparse linear regression problems. This paper develops an algorithm to estimate the regression coefficients via Lasso when the training data are distributed across different agents, and communicating them to a central processing unit is prohibited due to, e.g., communication cost or privacy reasons. The novel distributed algorithm is obtained after reformulating the Lasso into a separable form, which is iteratively minimized using the alternating-direction method of multipliers (ADMM) so as to gain the desired degree of parallelization. The per-agent estimate updates are given by simple soft-thresholding operations, and the inter-agent communication overhead remains at an affordable level. Without exchanging elements of the different training sets, the local estimates provably reach consensus on the global Lasso solution, i.e., the fit that would be obtained if the entire data set were centrally available. Numerical experiments corroborate the convergence and global optimality of the proposed distributed scheme.
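To make the described workflow concrete, the following is a minimal Python/NumPy sketch of the Lasso solved by consensus ADMM in the standard global-consensus form: each agent j holds a local block (X_j, y_j) of the training data and a local coefficient copy w_j, and a soft-thresholding step enforces sparsity on the shared iterate. This is an illustrative sketch, not a transcription of the paper's in-network algorithm: it averages across all agents for brevity, whereas the paper's scheme exchanges messages only among neighboring agents; the names soft_threshold and consensus_lasso are hypothetical.

import numpy as np

def soft_threshold(v, tau):
    # Element-wise soft-thresholding operator: sign(v) * max(|v| - tau, 0).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def consensus_lasso(X_parts, y_parts, lam, rho=1.0, n_iter=200):
    # Consensus-ADMM sketch for min_w (1/2) sum_j ||y_j - X_j w||^2 + lam ||w||_1,
    # with the data split across J agents (illustrative, not the paper's exact updates).
    J = len(X_parts)
    p = X_parts[0].shape[1]
    w = [np.zeros(p) for _ in range(J)]   # local coefficient estimates
    u = [np.zeros(p) for _ in range(J)]   # scaled dual variables
    z = np.zeros(p)                        # consensus (global) variable
    # Cache a Cholesky factor of each agent's regularized normal equations.
    chol = [np.linalg.cholesky(X.T @ X + rho * np.eye(p)) for X in X_parts]
    for _ in range(n_iter):
        # Local least-squares updates, computable in parallel by each agent.
        for j in range(J):
            rhs = X_parts[j].T @ y_parts[j] + rho * (z - u[j])
            w[j] = np.linalg.solve(chol[j].T, np.linalg.solve(chol[j], rhs))
        # Soft-thresholding step enforcing sparsity on the consensus iterate.
        w_bar = np.mean(w, axis=0)
        u_bar = np.mean(u, axis=0)
        z = soft_threshold(w_bar + u_bar, lam / (rho * J))
        # Dual updates driving the local estimates toward consensus.
        for j in range(J):
            u[j] += w[j] - z
    return z

Because each per-agent factorization is computed once and cached, every iteration costs only two triangular solves per agent plus the exchange of length-p vectors, which illustrates how the communication overhead can remain modest relative to shipping the raw training sets.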