QuickSAND: Quick Summary and Analysis of Network Data

Monitoring and analyzing tra c data generated from large ISP networks imposes challenges both at the data gathering phase as well as the data analysis itself. Still both tasks are crucial for responding to day to day challenges of engineering large networks with thousands of customers. In this paper we build on the premise that approximation is a necessary evil of handling massive datasets such as network data. We propose building compact summaries of the tra c data called sketches at distributed network elements and centers. These sketches are able to respond well to queries that seek features that stand out of the data. We call such features \heavy hitters." In this paper, we describe sketches and show how to use sketches to answer aggregate and trend-related queries and identify heavy hitters. This may be used for exploratory data analysis of network operations interest. We support our proposal by experimentally studying AT&T WorldNet data and performing a feasibility study on the Cisco NetFlow data collected at several routers.

[1]  S. Muthukrishnan,et al.  Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries , 2001, VLDB.

[2]  Anja Feldmann,et al.  Deriving traffic demands for operational IP networks: methodology and experience , 2000, SIGCOMM.

[3]  Balachander Krishnamurthy,et al.  On network-aware clustering of Web clients , 2000, SIGCOMM.

[4]  D. Veitch Wavelet Analysis of Long Range Dependent Traac , 1998 .

[5]  Anja Feldmann,et al.  Data networks as cascades: investigating the multifractal nature of Internet WAN traffic , 1998, SIGCOMM '98.

[6]  Anja Feldmann,et al.  The changing nature of network traffic: scaling phenomena , 1998, CCRV.

[7]  Anja Feldmann,et al.  Dynamics of IP traffic: a study of the role of variability and the impact of control , 1999, SIGCOMM '99.

[8]  Yossi Matias,et al.  DIMACS Series in Discrete Mathematicsand Theoretical Computer Science Synopsis Data Structures for Massive Data , 2007 .

[9]  Leon W. Cohen,et al.  Conference Board of the Mathematical Sciences , 1963 .

[10]  Noga Alon,et al.  The space complexity of approximating the frequency moments , 1996, STOC '96.

[11]  Ramon C. Barquin,et al.  Building, using, and managing the data warehouse , 1997 .

[12]  W. B. Johnson,et al.  Extensions of Lipschitz mappings into Hilbert space , 1984 .

[13]  Anja Feldmann,et al.  Scaling Analysis of Conservative Cascades, with Applications to Network Traffic , 1999, IEEE Trans. Inf. Theory.

[14]  Prabhakar Raghavan,et al.  Computing on data streams , 1999, External Memory Algorithms.

[15]  Alan M. Frieze,et al.  Min-Wise Independent Permutations , 2000, J. Comput. Syst. Sci..