Applied Graph-Mining Algorithms to Study Biomolecular Interaction Networks

Protein-protein interaction (PPI) networks carry vital information on the organization of molecular interactions in cellular systems. Identifying functionally relevant modules in PPI networks is one of the most important applications of biological network analysis, and computational analysis has become an indispensable tool for understanding large-scale biomolecular interaction networks. Of the computational methods developed for analyzing PPI networks, graph comparison and module detection are the two most commonly used strategies. This review summarizes the current literature on both: graph kernel and graph alignment methods for graph comparison, and seed-and-extend, hierarchical clustering, optimization-based, probabilistic, and frequent subgraph methods for module detection. We provide a comprehensive review of the major algorithms under each theme, including our recently published frequent subgraph method for detecting functional modules commonly shared across multiple cancer PPI networks.
