An enhanced Graph Analytics Platform (GAP) providing insight in Big Network Data

Abstract Network graphs are a widely adopted and acknowledged practice for representing inter- and intra-dependent information streams, and they are growing vast in both size and complexity due to the rapid expansion of the sources, types, and amounts of produced data. In this context, the efficient processing of such large amounts of information, also known as Big Data, forms a major challenge for both the research community and a wide variety of industrial sectors, including security, health and financial applications. Serving these emerging needs, this paper presents a Graph Analytics based Platform (GAP) that implements a top-down approach to facilitate Data Mining processes by incorporating state-of-the-art techniques such as behavioural clustering, interactive visualizations and multi-objective optimization. The applicability of the platform is validated on two distinct real-world use cases, which can be considered characteristic examples of modern Big Data problems due to the vast amount of information they deal with: (i) the root cause analysis of a Denial of Service attack in the network of a mobile operator and (ii) the early detection of an emerging event or hot topic in social media communities. To address the large volume of data, the proposed application starts with an aggregated overview of the whole network and allows the operator to gradually focus on smaller sets of data, using different levels of abstraction. The platform differentiates between user behaviours, which enables the analyst to gain insight into the network’s operation and to extract meaningful information with little effort. Dynamic hypothesis formulation techniques, supported by graph traversal and pattern mining, enable the analyst to set concrete network-related hypotheses and validate or reject them accordingly.
