Fast Algorithms for Optimal Link Selection in Large-Scale Network Monitoring

The robustness and integrity of IP networks require efficient tools for traffic monitoring and analysis that scale well with traffic volume and network size. We address the problem of optimal large-scale monitoring of computer networks under resource constraints. Specifically, we consider the task of selecting the “best” subset of at most K links to monitor, so as to optimally predict the traffic load on the remaining links. Optimality is quantified in terms of the statistical error of the network traffic predictors. The optimal monitoring problem is subject to combinatorial constraints, which render algorithms that seek the exact solution impractical. We develop a number of fast algorithms that improve upon existing ones in both computational complexity and accuracy. Our algorithms exploit the geometry of principal component analysis, which also leads to new types of theoretical bounds on the prediction error. Finally, these algorithms are amenable to randomization: the best of several parallel, independent instances often yields the exact optimal solution. Their performance is illustrated and evaluated on simulated and real-network traces.
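To make the selection task concrete, the following is a minimal sketch of a generic greedy heuristic, not the algorithms developed in the paper. It assumes the link loads have a known covariance matrix and that prediction error is measured as the total variance left after the best linear prediction from the monitored links; the function name `greedy_link_selection` and the parameter `tol` are hypothetical.

```python
def greedy_link_selection(cov, k, tol=1e-12):
    """Greedily pick at most k links to monitor.

    cov: covariance matrix of link loads, as a list of lists (assumed known).
    Returns (selected link indices, total residual prediction variance).
    """
    n = len(cov)
    # Residual covariance of all links given the links selected so far.
    resid = [row[:] for row in cov]
    selected = []
    for _ in range(min(k, n)):
        best, best_gain = None, tol
        for j in range(n):
            if resid[j][j] <= tol:
                continue  # already monitored or fully predictable
            # Reduction in total residual variance if link j is monitored.
            gain = sum(resid[i][j] ** 2 for i in range(n)) / resid[j][j]
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None:
            break  # nothing left to explain
        # Condition on the chosen link (rank-one Schur complement update).
        col = [resid[i][best] for i in range(n)]
        pivot = resid[best][best]
        for i in range(n):
            for m in range(n):
                resid[i][m] -= col[i] * col[m] / pivot
        selected.append(best)
    error = sum(resid[i][i] for i in range(n))
    return selected, error


# Toy example: links 0 and 1 are strongly correlated, link 2 is independent.
toy_cov = [[4.0, 3.6, 0.0],
           [3.6, 4.0, 0.0],
           [0.0, 0.0, 1.0]]
links, err = greedy_link_selection(toy_cov, k=2)
print(links, err)
```

A greedy pass of this kind costs far less than enumerating all subsets of size K, but it carries no guarantee of reaching the exact optimum.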