Community detection in social networks using user frequent pattern mining

Social networking sites offer a rich source of heterogeneous data, and analysing such data can reveal previously unknown information and relations in these networks. Detecting communities of ‘similar’ nodes is a challenging problem in social network analysis, and it has been studied widely, mostly in terms of the underlying graph structure. Online social networks, however, carry useful information about their users in addition to the graph structure, and using this information can improve the quality of community discovery. In this study, a community discovery method is proposed that uses content information as well as the links among nodes to improve the quality of the discovered communities. The approach is based on frequent patterns in the actions of users on networks, particularly social networking sites where users carry out their preferred activities. The main contributions of the proposed method are twofold: first, small communities of similar users are discovered based on the interests and activities of users on the network; second, the discovered communities are extended using social relations. The F-measure is used to evaluate the results on two real-world datasets (BlogCatalog and Flickr), demonstrating that the proposed method leads to improved community detection quality.
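The two steps described above, followed by an F-measure check, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the mining algorithm, the extension rule, or any thresholds, so the function names (`seed_communities`, `extend_community`, `f_measure`), the naive itemset enumeration, the `min_fraction` rule, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def seed_communities(actions, min_support, size=2):
    """Step 1 (assumed form): users sharing a frequent action set form a seed.

    An action set of `size` items is "frequent" when at least
    `min_support` users exhibit all of its items.
    """
    supporters = defaultdict(set)
    for user, items in actions.items():
        for itemset in combinations(sorted(items), size):
            supporters[itemset].add(user)
    return {s: u for s, u in supporters.items() if len(u) >= min_support}


def extend_community(seed, friends, min_fraction=0.6):
    """Step 2 (assumed rule): absorb neighbours whose friends are mostly inside."""
    community = set(seed)
    changed = True
    while changed:
        changed = False
        # Candidates are users adjacent to the community but not yet in it.
        frontier = {f for u in community for f in friends.get(u, ())} - community
        for user in frontier:
            nbrs = friends.get(user, set())
            if nbrs and len(nbrs & community) / len(nbrs) >= min_fraction:
                community.add(user)
                changed = True
    return community


def f_measure(found, truth):
    """Harmonic mean of precision and recall for one community."""
    overlap = len(found & truth)
    if overlap == 0:
        return 0.0
    precision = overlap / len(found)
    recall = overlap / len(truth)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: user -> actions (e.g. tags or groups), user -> friends.
actions = {
    "ann": {"photo", "travel", "food"},
    "bob": {"photo", "travel"},
    "cat": {"photo", "travel", "music"},
    "dan": {"music", "games"},
    "eve": {"games"},
}
friends = {
    "ann": {"bob", "cat", "fay"},
    "bob": {"ann", "fay"},
    "cat": {"ann", "dan"},
    "dan": {"cat", "eve"},
    "eve": {"dan"},
    "fay": {"ann", "bob"},
}

seeds = seed_communities(actions, min_support=3)
community = extend_community(seeds[("photo", "travel")], friends)
score = f_measure(community, {"ann", "bob", "cat", "dan", "fay"})
print(seeds, community, score)
```

The enumeration in `seed_communities` is exponential in `size` and is only suitable for toy inputs; on real data a dedicated frequent itemset miner would take its place. The extension rule is one simple choice among many, and the ground-truth community passed to `f_measure` is invented for the example.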
