Improving the Robustness of Local Network Alignment: Design and Extensive Assessment of a Markov Clustering-Based Approach

The analysis of protein behavior at the network level has been applied to elucidate the mechanisms of protein interaction that are similar in different species. Published network alignment algorithms have proved able to recapitulate known conserved modules and protein complexes, and to infer new conserved interactions confirmed by wet-lab experiments. In the meantime, however, a plethora of continuously evolving protein-protein interaction (PPI) data sets have been developed, each featuring different levels of completeness and reliability. Moreover, existing papers did not deeply investigate the robustness of alignment algorithms. For instance, the performance of some algorithms varies significantly when the data set used in their assessment is changed. In this work, we design an extensive assessment of current algorithms, discussing the robustness of the results with respect to the input networks. We also present AlignMCL, a local network alignment algorithm based on an improved model of alignment graph and Markov Clustering. AlignMCL performs better than other state-of-the-art local alignment algorithms over different updated data sets. In addition, AlignMCL features high levels of robustness, producing similar results regardless of the selected data set.
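For readers unfamiliar with Markov Clustering, the following is a minimal sketch of the generic procedure (alternating expansion and inflation of a column-stochastic matrix), applied to a small unweighted graph. It is not the AlignMCL implementation: the construction of the alignment graph is not shown, and the function names, the inflation value of 2.0, the convergence tolerance, and the toy graph are all assumptions chosen for illustration.

```python
def normalize_columns(m):
    """Rescale each column so that it sums to 1 (column-stochastic)."""
    n = len(m)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = sum(m[i][j] for i in range(n))
        for i in range(n):
            out[i][j] = m[i][j] / s if s else 0.0
    return out


def matmul(a, b):
    """Plain dense product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(edges, n, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected graph on nodes 0..n-1 (illustrative sketch).

    `inflation`, `max_iter` and `tol` are assumed defaults, not values
    taken from the paper.
    """
    # Adjacency matrix with self-loops, then column normalisation.
    m = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        m[u][v] = m[v][u] = 1.0
    for i in range(n):
        m[i][i] = 1.0
    m = normalize_columns(m)

    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: two-step random walk
        # Inflation: element-wise power followed by renormalisation.
        inflated = normalize_columns(
            [[x ** inflation for x in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break

    # Each row with non-negligible entries lists the members of one cluster.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy input: two disconnected triangles.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
print(markov_cluster(toy_edges, 6))
```

On this toy input each triangle forms its own cluster. On real PPI-derived graphs the inflation parameter controls cluster granularity, with higher values generally yielding smaller clusters.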
