Inference of social network behavior from Internet traffic traces

All network traffic is a byproduct of social networking. In this paper, Anonymized Internet (IP) Trace Datasets obtained from the Center for Applied Internet Data Analysis (CAIDA) are used to identify and estimate characteristics of the underlying social network from the overall traffic. The analysis methods fall into two groups: the first is based on frequency analysis and the second on traffic matrices, with the latter further subdivided according to the traffic mean, variance and covariance. The frequency analysis of origin (O), destination (D) and O-D pair statistics exhibits heavy-tailed behavior. Because of the large number of IP addresses contained in the CAIDA datasets, only the most predominant IP addresses are used when estimating all three types of traffic matrix. Principal Component Analysis (PCA) and related methods are applied to identify key features of each type of traffic matrix. A new system called Antraff has been developed to carry out all the analysis procedures.
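The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the Antraff implementation: the packet tuples are synthetic rather than CAIDA data, the function names (`top_k_addresses`, `od_series`, `principal_components`) are hypothetical, and the definitions of the mean, variance and covariance matrices are assumed here to be per-interval byte statistics over O-D pairs. The sketch counts address frequencies, keeps the most frequent addresses, builds a per-interval O-D byte series, and applies PCA to the covariance matrix by eigen-decomposition.

```python
from collections import Counter

import numpy as np


def top_k_addresses(packets, k):
    """Return the k addresses seen most often as origin or destination."""
    counts = Counter()
    for _, src, dst, _ in packets:
        counts[src] += 1
        counts[dst] += 1
    return [addr for addr, _ in counts.most_common(k)]


def od_series(packets, addresses, interval, n_intervals):
    """Bytes per time interval for each O-D pair among the chosen addresses.

    Returns an array of shape (n_intervals, k * k); column i * k + j holds
    the traffic from addresses[i] to addresses[j].
    """
    index = {addr: i for i, addr in enumerate(addresses)}
    k = len(addresses)
    series = np.zeros((n_intervals, k * k))
    for t, src, dst, size in packets:
        if src in index and dst in index:
            slot = int(t // interval)
            if 0 <= slot < n_intervals:
                series[slot, index[src] * k + index[dst]] += size
    return series


def principal_components(matrix):
    """Eigen-decompose a symmetric matrix, largest eigenvalue first."""
    values, vectors = np.linalg.eigh(matrix)
    order = np.argsort(values)[::-1]
    return values[order], vectors[:, order]


# Synthetic packets: (time in seconds, origin, destination, bytes).
packets = [
    (0.1, "A", "B", 100), (0.4, "A", "B", 300), (0.7, "B", "A", 50),
    (1.2, "A", "C", 200), (1.5, "A", "B", 100), (2.3, "C", "A", 400),
    (2.6, "A", "B", 500), (2.9, "D", "A", 60),
]

k = 3
addresses = top_k_addresses(packets, k)  # frequency analysis step
series = od_series(packets, addresses, interval=1.0, n_intervals=3)

# Assumed definitions of the three matrix types (per-interval statistics).
mean_matrix = series.mean(axis=0).reshape(k, k)
var_matrix = series.var(axis=0, ddof=1).reshape(k, k)
cov_matrix = np.cov(series, rowvar=False)  # (k*k) x (k*k), O-D pair vs O-D pair

eigenvalues, eigenvectors = principal_components(cov_matrix)
```

Restricting the matrices to the `k` most frequent addresses keeps the covariance matrix at `k*k` by `k*k` entries, which reflects the abstract's point that the full address set is too large to use directly.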
