Multi-Temporal Analysis and Scaling Relations of 100,000,000,000 Network Packets

Our society has never been more dependent on computer networks. Effective utilization of networks requires a detailed understanding of the normal background behaviors of network traffic. Large-scale measurements of networks are computationally challenging. Building on prior work in interactive supercomputing and GraphBLAS hypersparse hierarchical traffic matrices, we have developed an efficient method for computing a wide variety of streaming network quantities on diverse time scales. Applying these methods to 100,000,000,000 anonymized source-destination pairs collected at a network gateway reveals many previously unobserved scaling relationships. These observations provide new insights into normal network background traffic that could be used for anomaly detection, AI feature engineering, and testing theoretical models of streaming networks.
