Spark-Based Parallel Method for Prediction of Events

Event prediction is important in many social network (SN) applications, because events shape how a network evolves over time. Studying these events gives better insight into the evolutionary patterns of communities in social networks. A major challenge in implementing such prediction is processing and structuring large datasets to suit machine learning (ML) models. This paper proposes a Spark-based parallel method for detecting, mining, and predicting the events that influence the evolution of communities in a temporal SN. The proposed framework processes large temporal data (taken from the DBLP dataset), uses parallel algorithms to detect structural changes, and applies ML techniques to predict future structural changes (events). The methodology uses ensemble ML methods in the Spark ML pipeline to achieve the desired performance and accuracy. The experimental results show that the proposed framework can predict future events with an accuracy of 82% and saves 99% of computational time.
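
The abstract does not spell out how structural changes are detected, so the following is only a minimal single-machine sketch of the general idea, not the paper's Spark implementation. It compares the communities of two consecutive snapshots by Jaccard overlap and labels each change. The event names (form, dissolve, grow, shrink, merge, split), the function names, and the 0.3 overlap threshold are all illustrative assumptions.

```python
from typing import Dict, List, Set, Tuple

Communities = Dict[str, Set[str]]  # community id -> member ids (hypothetical layout)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two member sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def detect_events(prev: Communities, curr: Communities,
                  threshold: float = 0.3) -> List[Tuple[str, str]]:
    """Label structural changes between two consecutive snapshots.

    The threshold and the event taxonomy are assumptions made for this sketch.
    """
    # Communities in the next snapshot that sufficiently overlap each old one.
    successors = {
        p: [c for c in curr if jaccard(prev[p], curr[c]) >= threshold]
        for p in prev
    }
    # Communities in the previous snapshot that sufficiently overlap each new one.
    predecessors = {
        c: [p for p in prev if jaccard(prev[p], curr[c]) >= threshold]
        for c in curr
    }

    events: List[Tuple[str, str]] = []
    for p, succ in successors.items():
        if not succ:
            events.append(("dissolve", p))
        elif len(succ) > 1:
            events.append(("split", p))
    for c, pred in predecessors.items():
        if not pred:
            events.append(("form", c))
        elif len(pred) > 1:
            events.append(("merge", c))
        elif len(successors[pred[0]]) == 1:
            # One-to-one match: compare sizes to decide between grow and shrink.
            before, after = len(prev[pred[0]]), len(curr[c])
            if after > before:
                events.append(("grow", c))
            elif after < before:
                events.append(("shrink", c))
    return events


# Toy snapshots with made-up author ids.
prev = {"A": {"a", "b", "c"}, "B": {"d", "e"}}
curr = {"A2": {"a", "b", "c", "f"}, "D": {"p", "q"}}
events = detect_events(prev, curr)
print(events)
```

Each pair of consecutive snapshots is processed independently of the others, which is what would make this step a natural candidate for parallel execution; the labelled events could then serve as training targets for a classifier.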