Decentralized Sparsity-Regularized Rank Minimization: Algorithms and Applications

Given a limited number of entries from the superposition of a low-rank matrix and the product of a known compression matrix with a sparse matrix, recovery of the low-rank and sparse components is a fundamental task subsuming compressed sensing, matrix completion, and principal components pursuit. This paper develops algorithms for decentralized sparsity-regularized rank minimization over networks, where the nuclear norm and the ℓ1-norm are used as surrogates for the rank and the number of nonzero entries of the sought matrices, respectively. While nuclear-norm minimization has well-documented merits when centralized processing is viable, the non-separability of the singular-value sum challenges its decentralized minimization. To overcome this limitation, an alternative characterization of the nuclear norm is leveraged to obtain a separable, yet non-convex, cost that is minimized via the alternating-direction method of multipliers. Interestingly, if the decentralized (non-convex) estimator converges, then under certain conditions it provably attains the global optimum of its centralized counterpart. As a result, this paper bridges the performance gap between centralized and in-network decentralized sparsity-regularized rank minimization. This, in turn, facilitates (stable) recovery of the low-rank and sparse model matrices through reduced-complexity per-node computations and affordable message passing among single-hop neighbors. Several application domains are outlined to highlight the generality and impact of the proposed framework. These include unveiling traffic anomalies in backbone networks and predicting network-wide path latencies. Simulations with synthetic and real network data confirm the convergence of the novel decentralized algorithm and its centralized performance guarantees.
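The "alternative characterization of the nuclear norm" mentioned above is not spelled out in the abstract. A standard one, assumed here for illustration, is that the nuclear norm of X equals the minimum of (1/2)(||P||_F^2 + ||Q||_F^2) over all factorizations X = PQ'. The Frobenius terms are sums over the rows of P and Q, which is the kind of separability a per-node cost would need. The NumPy sketch below checks this identity numerically on a small synthetic matrix; the function names and matrix sizes are hypothetical and are not taken from the paper.

```python
import numpy as np

def nuclear_norm(X):
    # Sum of singular values of X.
    return np.linalg.svd(X, compute_uv=False).sum()

def balanced_factors(X, rank):
    # Factor X = P @ Q.T with the singular values split evenly
    # (square roots) between the two factors.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    root = np.sqrt(s[:rank])
    P = U[:, :rank] * root
    Q = Vt[:rank, :].T * root
    return P, Q

def factor_cost(P, Q):
    # Separable surrogate: half the sum of squared Frobenius norms.
    return 0.5 * (np.linalg.norm(P, "fro") ** 2 + np.linalg.norm(Q, "fro") ** 2)

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 6))  # rank-3 matrix

# The balanced factorization attains the nuclear norm.
P, Q = balanced_factors(X, rank=3)

# Any other factorization of the same X can only cost as much or more.
G = rng.standard_normal((3, 3)) + 3.0 * np.eye(3)  # invertible mixing matrix (assumed)
P2 = P @ G
Q2 = Q @ np.linalg.inv(G).T

print("nuclear norm      :", nuclear_norm(X))
print("balanced cost     :", factor_cost(P, Q))
print("unbalanced cost   :", factor_cost(P2, Q2))

# Row separability: the cost on P is a sum of per-row-block costs,
# e.g. one block of rows per network node (a hypothetical split).
blocks = np.array_split(P, 4, axis=0)
print("sum of block costs:", sum(np.linalg.norm(B, "fro") ** 2 for B in blocks))
```

The non-convexity noted in the abstract comes from the bilinear product PQ' in the data-fit term, not from the Frobenius-norm regularizers themselves.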
