A benchmarking tool for the generation of bipartite network models with overlapping communities

Many real-world networks display hidden community structures with potentially important implications for their dynamics. Numerous algorithms have been introduced to unveil such structures. Alternative solutions are typically assessed and compared by benchmarking the target algorithms on a diverse set of networks that exhibit a broad range of controlled features, so that the assessment covers multiple representative properties. Tools have been developed to synthesize bipartite networks, but none of them generates networks with overlapping community structures. This gap motivates BNOC, the tool introduced in this paper. BNOC synthesizes bipartite networks that mimic a wide range of features of real-world networks, including overlapping community structures. Multiple parameters give flexible control over the scale and topological properties of the networks and their embedded communities. BNOC’s applicability is illustrated by assessing and comparing two popular overlapping community detection algorithms, HLC and OSLOM, on bipartite networks. The results reveal interesting features of the algorithms in this scenario and confirm the importance of a suitable benchmarking tool. Finally, to validate our approach, we compare networks synthesized with BNOC against those obtained with an existing benchmarking tool and against established sets of synthetic networks, in two different scenarios.
