ACO-based clustering for Ego Network analysis

Abstract The unstoppable growth of Social Networks (SNs), and the huge number of connected users, have made these networks one of the most popular and successful domains for a large number of research areas. The different possibilities, volume and variety that these SNs offer have made them an essential tool for everyday work and social relationships. One of the basic features that any SN provides is to allow users to group, organize and classify their connections into different groups, or “circles”. These circles can be defined using different characteristics such as roommates, workmates, hobbies, professional skills, etc. The problem of finding these circles, taking into account the variety, volume and dynamics of these SNs, has become an important challenge for a wide range of Computer Science areas, such as Big Data, Data Mining and Machine Learning. Problems related to pre-processing, fusion and knowledge discovery of information from these sources remain open. This paper presents a new bio-inspired method, based on Ant Colony Optimization (ACO) algorithms, that has been designed to find and analyze these circles. Given any user in a network, the new method is able to automatically determine the users that compose his/her groups or circles of interest, so that the network is clustered into different components based on the users' profiles and their dynamics. This algorithm has been applied to Ego Networks, where the node at the center of the network (called “Ego”) represents the user being studied. In this work, two different ACO algorithms have been designed, which differ in the source of information used to perform the community-finding task. The first ACO algorithm uses information extracted from the topology of the network, whereas the second one uses the profile information provided by users. The proposed algorithms are able to detect the different circles in three popular Social Networks: Facebook, Twitter and Google+.
Finally, using several datasets from these SNs, an experimental evaluation of our methods has been carried out to show how the algorithms currently perform.
