Online social network trend discovery using frequent subgraph mining

Graph mining has become a well-established discipline within data mining. It has received considerable interest over the last decade, as advances in computer hardware have provided the processing power needed for large-scale graph data mining. Frequent subgraph mining (FSM) plays a significant role in graph mining and has attracted a great deal of attention in domains such as bioinformatics, web data mining and social networks. Online social networks (SNs) play an important role in today’s Internet; they contain huge amounts of data, which makes mining them a challenging problem. FSM has been used in SNs to identify the frequent pattern trends that exist in the network. A frequent pattern trend is defined as a sequence of time-stamped occurrence (support) values for a specific frequent pattern that exists in the data. Examples include the most active researchers, the most visited web pages and users’ navigation patterns over the web. In the past few years, social network trend mining has been an active area of research. Many graph mining algorithms have been proposed, but very limited effort has gone into capturing an important dimension of SNs, namely trend discovery. Therefore, this paper introduces a novel FSM approach, called A-RAFF (A RAnked Frequent pattern-growth Framework), for discovering and comparing the frequent pattern trends that exist in social network data. The frequent pattern trend analysis is evaluated on two standard social network datasets: a Facebook-like network and the well-known MSNBC news network. The discovered trends can help the underlying social networks enhance their platforms, both for the benefit of their users and for their own business growth.
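The definition of a frequent pattern trend given above can be illustrated with a minimal Python sketch. It is not the A-RAFF algorithm: the function name `pattern_trend`, the toy navigation data and the representation of each graph as a set of labelled edges are all assumptions made for illustration. Containment is tested as a plain edge-subset check, which only works because node labels are assumed to be unique; a real FSM method would need a subgraph isomorphism test.

```python
from collections import defaultdict


def pattern_trend(snapshots, pattern):
    """Return [(timestamp, support), ...] for one pattern.

    snapshots: iterable of (timestamp, edges) pairs, where edges is a set of
               (source, target) tuples with uniquely labelled nodes (assumed).
    pattern:   set of (source, target) tuples.
    Support at a time stamp is the number of graphs with that time stamp
    whose edge set contains every edge of the pattern.
    """
    support = defaultdict(int)
    for timestamp, edges in snapshots:
        # Adding 0 still registers the time stamp, so gaps show up as 0.
        support[timestamp] += int(pattern <= edges)
    return sorted(support.items())


# Hypothetical toy data: page-navigation graphs grouped by time stamp.
snapshots = [
    (1, {("home", "news"), ("news", "sports")}),
    (1, {("home", "news")}),
    (2, {("home", "weather")}),
    (2, {("home", "news"), ("news", "sports")}),
    (3, {("home", "news"), ("news", "sports")}),
    (3, {("home", "news"), ("news", "sports"), ("sports", "home")}),
]
pattern = {("home", "news"), ("news", "sports")}

trend = pattern_trend(snapshots, pattern)
print(trend)  # [(1, 1), (2, 1), (3, 2)]
```

The resulting list is the trend of the pattern: one support value per time stamp, which can then be compared across patterns or plotted over time.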
