AOC: Assembling overlapping communities

Abstract

Through the discovery of mesoscale structures, community detection methods contribute to the understanding of complex networks. Many community-finding methods, however, rely on disjoint clustering techniques, in which each node is restricted to membership in a single community or cluster. This strict requirement limits how faithfully communities can be described, because some nodes may reasonably be assigned to multiple communities. We have previously reported Iterative K-core Clustering, a scalable and modular pipeline that discovers disjoint research communities from the scientific literature. We now present Assembling Overlapping Clusters (AOC), a complementary metamethod for overlapping communities, as one option for addressing the limitations of disjoint clustering. We present findings from applying AOC to a network of over 13 million nodes that captures recent research in the rapidly growing field of extracellular vesicles in biology.
