SAlign–a structure aware method for global PPI network alignment

Background: High-throughput experiments have generated a large amount of protein interaction data, which is being used to study protein networks. Studying complete protein networks can reveal more about healthy and disease states than studying proteins in isolation. Similarly, a comparative study of protein–protein interaction (PPI) networks of different species yields insights that may help in disease analysis and drug design. PPI network alignment also helps in understanding the biological systems of different species and can be used to transfer knowledge across species. Several aligners have been introduced in the last decade, but developing an accurate and scalable global alignment algorithm that ensures biologically significant alignments is still challenging.

Results: This paper presents a novel global pairwise network alignment algorithm, SAlign, which uses topological and biological information in the alignment process. The proposed algorithm incorporates both sequence and structural information when computing biological scores, whereas previous algorithms use only sequence information. Alignments produced by the proposed technique show that combining structure and sequence results in significantly better pairwise alignments. We compared SAlign with state-of-the-art algorithms on the basis of the semantic similarity of the alignment and the number of aligned nodes on multiple PPI network pairs. On network pairs with a high percentage of proteins with available structure, the results of SAlign are 3–63% better in semantic similarity than those of all existing techniques. Furthermore, it aligns 5–14% more nodes of these network pairs than existing aligners. On the other PPI network pairs, the results of SAlign are comparable to or better than those of all existing techniques.
We also introduce $\mathrm{SAlign}^{\mathrm{mc}}$, a Monte Carlo based alignment algorithm that produces multiple network alignments with similar semantic similarity. This helps the user pick biologically meaningful alignments.

Conclusion: The proposed algorithm can find alignments that are more biologically significant than those of existing aligners. Furthermore, the proposed method can generate alternate alignments that help in studying different genes/proteins of the species.
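The abstract does not give the scoring formulas, so the following is only an illustrative sketch of the general idea: blend sequence and structure similarity into a biological score, combine it with a topological score, and then build either one alignment or several randomized alternatives. The weights `alpha` and `beta`, the fallback to sequence-only scoring when no structure is available, the greedy matching, and the `top_k` sampling rule are all assumptions made for illustration, not the method described in the paper.

```python
import random


def biological_score(seq_sim, struct_sim, beta=0.5):
    """Blend sequence and structure similarity (both assumed to lie in [0, 1]).

    Assumption: if no structure is available (struct_sim is None),
    fall back to the sequence similarity alone.
    """
    if struct_sim is None:
        return seq_sim
    return beta * seq_sim + (1 - beta) * struct_sim


def combined_score(topo, bio, alpha=0.5):
    """Weighted sum of a topological score and a biological score (assumed form)."""
    return alpha * topo + (1 - alpha) * bio


def greedy_align(scores):
    """Build a one-to-one mapping from {(u, v): score}, best-scoring pairs first."""
    mapping, used_targets = {}, set()
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    for (u, v), _ in ranked:
        if u not in mapping and v not in used_targets:
            mapping[u] = v
            used_targets.add(v)
    return mapping


def monte_carlo_align(scores, top_k=2, seed=None):
    """Randomized variant: at each step, pick uniformly among the top_k
    remaining candidate pairs, so different seeds can give different alignments.
    """
    rng = random.Random(seed)
    remaining = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    mapping = {}
    while remaining:
        idx = rng.randrange(min(top_k, len(remaining)))
        (u, v), _ = remaining.pop(idx)
        mapping[u] = v
        # Drop every candidate that reuses either endpoint of the chosen pair.
        remaining = [((a, b), s) for (a, b), s in remaining if a != u and b != v]
    return mapping


if __name__ == "__main__":
    # Hypothetical similarity values for two tiny networks {a, b} and {x, y}.
    scores = {
        ("a", "x"): combined_score(0.9, biological_score(0.8, 0.9)),
        ("a", "y"): combined_score(0.2, biological_score(0.3, None)),
        ("b", "x"): combined_score(0.7, biological_score(0.6, 0.5)),
        ("b", "y"): combined_score(0.5, biological_score(0.4, 0.6)),
    }
    print(greedy_align(scores))
    for seed in range(3):
        print(monte_carlo_align(scores, top_k=2, seed=seed))
```

With `top_k=1` the randomized variant reduces to the greedy one; larger values trade some score for diversity among the generated alignments, which is the kind of choice a user would then make by inspecting the alternatives.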
