Extracting Dense and Connected Communities in Dual Networks: An Alignment Based Algorithm

Network-based models have been used to represent and analyse datasets in many fields, such as computational biology, medical informatics and social networks. Nevertheless, it has recently been shown that, in their standard form, they are unable to capture some aspects of the investigated scenarios. Thus, richer models, such as heterogeneous networks or dual networks, have been proposed. We focus on the latter model, which consists of a pair of networks having the same nodes but different edges. In a dual network, one network, called physical, has unweighted edges representing binary associations among nodes. The other, called conceptual, has weighted edges, where each weight represents the strength of the association between two nodes. Dual networks thus capture in a single model some aspects that a standard network cannot describe. For instance, a dual network can represent a co-authorship scenario: the physical network links authors who have co-authored a paper, while the conceptual network links pairs of authors who share topics. This allows capturing similar interests among authors even when they are not co-authors. We propose an algorithm to find the Densest Connected Subgraph (DCS) in dual networks. A DCS is a subgraph of maximum density in the conceptual network whose nodes also induce a connected subgraph in the physical network. A DCS therefore represents a set of highly similar nodes. Since finding a DCS is a computationally hard problem, we propose novel heuristics to solve it. We tested the proposed algorithm on social, biological, and co-authorship networks. The results show that our approach is efficient and is able to extract meaningful information from dual networks.
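
The DCS definition above can be illustrated on a toy dual network. The following is a minimal sketch, not the algorithm proposed in the paper: it enumerates every node subset by brute force, which is only feasible for a handful of nodes. It assumes that density is the sum of the conceptual edge weights inside the subset divided by the number of nodes in the subset; the abstract does not state the exact density measure. All function names and the toy data are hypothetical.

```python
from itertools import combinations


def is_connected(nodes, physical_edges):
    """Return True if `nodes` induce a connected subgraph of the physical network."""
    nodes = set(nodes)
    if not nodes:
        return False
    adj = {v: set() for v in nodes}
    for u, v in physical_edges:
        if u in nodes and v in nodes:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:  # depth-first search restricted to `nodes`
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes


def density(nodes, conceptual_weights):
    """Assumed density: total conceptual weight inside `nodes` divided by |nodes|."""
    nodes = set(nodes)
    total = sum(w for (u, v), w in conceptual_weights.items()
                if u in nodes and v in nodes)
    return total / len(nodes)


def brute_force_dcs(all_nodes, physical_edges, conceptual_weights):
    """Exhaustive search over subsets of size >= 2 (exponential; toy inputs only)."""
    best, best_density = None, float("-inf")
    for k in range(2, len(all_nodes) + 1):
        for subset in combinations(sorted(all_nodes), k):
            if not is_connected(subset, physical_edges):
                continue  # violates the physical-connectivity constraint
            d = density(subset, conceptual_weights)
            if d > best_density:
                best, best_density = set(subset), d
    return best, best_density


# Toy dual network: node "e" is strongly similar to "a" in the conceptual
# network but isolated in the physical one, so it can never be part of a DCS.
nodes = {"a", "b", "c", "d", "e"}
physical_edges = [("a", "b"), ("b", "c"), ("c", "d")]
conceptual_weights = {
    ("a", "b"): 1.0,
    ("a", "c"): 0.5,
    ("b", "c"): 0.75,
    ("c", "d"): 0.25,
    ("a", "e"): 5.0,
}

best, best_density = brute_force_dcs(nodes, physical_edges, conceptual_weights)
print(best, best_density)
```

On this toy input the pair {a, e} has the highest conceptual density but is rejected because it is not connected in the physical network; the search instead returns {a, b, c}. The exponential cost of the enumeration is why heuristics are needed for networks of realistic size.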
