Querying Graphs in Protein-Protein Interactions Networks Using Feedback Vertex Set

Recent techniques are rapidly increasing our knowledge of interactions between proteins. Interpreting this new information depends on our ability to retrieve known substructures in the data, namely Protein-Protein Interaction (PPI) networks. From an algorithmic point of view, this is a hard task, since it often leads to NP-hard problems. To overcome this difficulty, many authors have provided tools for querying patterns with a restricted topology, i.e., paths or trees, in PPI networks. Such restrictions lead to fixed-parameter tractable (FPT) algorithms, which are practical for queries of restricted size. Unfortunately, Graph Homomorphism is a W[1]-hard problem, and hence no FPT algorithm is expected to exist (unless FPT = W[1]) when patterns are general graphs. However, Dost et al. [2] gave an algorithm (which is not implemented) to query graphs of bounded treewidth in PPI networks, with the treewidth of the query appearing in the time complexity. In this paper, we propose another algorithm for querying patterns in the shape of graphs, also based on dynamic programming and the color-coding technique. To transform graph queries into trees without loss of information, we use a feedback vertex set coupled with a node duplication mechanism. Hence, our algorithm is FPT for querying graphs whose feedback vertex set has bounded size. This gives an alternative to the treewidth parameter, which can be better or worse for a given query. We provide a Python implementation, which we validate on real data. In particular, we retrieve some human queries in the shape of graphs in the fly PPI network.
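The idea of turning a graph query into an acyclic one through a feedback vertex set and node duplication can be illustrated with a minimal Python sketch. It is not the paper's implementation: the brute-force search, the function names (`is_forest`, `min_feedback_vertex_set`, `duplicate_fvs`) and the rule "one copy per incident edge" are assumptions chosen for illustration, and the result is in general a forest rather than a single tree.

```python
from itertools import combinations


def is_forest(adj):
    """Return True if the undirected graph (dict: node -> set of neighbours) is acyclic."""
    seen = set()
    for root in adj:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, None)]
        while stack:
            node, parent = stack.pop()
            for nxt in adj[node]:
                if nxt == parent:
                    continue
                if nxt in seen:  # an edge to an already reached node closes a cycle
                    return False
                seen.add(nxt)
                stack.append((nxt, node))
    return True


def remove_vertices(adj, removed):
    """Return the subgraph induced by the nodes that are not in `removed`."""
    return {u: {v for v in nbrs if v not in removed}
            for u, nbrs in adj.items() if u not in removed}


def min_feedback_vertex_set(adj):
    """Brute-force smallest feedback vertex set (assumption: only for tiny queries)."""
    nodes = sorted(adj)
    for k in range(len(nodes) + 1):
        for cand in combinations(nodes, k):
            if is_forest(remove_vertices(adj, set(cand))):
                return set(cand)


def duplicate_fvs(adj, fvs):
    """Replace each feedback vertex by one copy per incident edge (hypothetical scheme).

    Returns the acyclic graph and a map from each node to the query node it stands for.
    """
    forest = {u: set() for u in adj if u not in fvs}
    origin = {u: u for u in forest}
    counter = {}

    def new_copy(v):
        counter[v] = counter.get(v, 0) + 1
        name = (v, counter[v])
        forest[name] = set()
        origin[name] = v
        return name

    done = set()
    for u in adj:
        for v in adj[u]:
            edge = frozenset((u, v))
            if edge in done:
                continue
            done.add(edge)
            a = new_copy(u) if u in fvs else u
            b = new_copy(v) if v in fvs else v
            forest[a].add(b)
            forest[b].add(a)
    return forest, origin


# Toy query: a triangle a-b-c with a pendant node d attached to c.
query = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
fvs = min_feedback_vertex_set(query)
forest, origin = duplicate_fvs(query, fvs)
```

Every edge of the query survives in `forest`, and `origin` records which copies stand for the same query node. A matching procedure built on top of such a structure would presumably have to map all copies of a node to the same protein of the network, which is what would keep the transformation free of information loss.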

[1]  Marie-France Sagot,et al.  Assessing the Exceptionality of Coloured Motifs in Networks , 2008, EURASIP J. Bioinform. Syst. Biol..

[2]  Roded Sharan,et al.  Efficient Algorithms for Detecting Signaling Pathways in Protein Interaction Networks , 2006, J. Comput. Biol..

[3]  Roded Sharan,et al.  QNet: A Tool for Querying Protein Interaction Networks , 2007, RECOMB.

[4]  Guillaume Blin,et al.  Querying Protein-Protein Interaction Networks , 2009, ISBRA.

[5]  Michael R. Fellows,et al.  Sharp Tractability Borderlines for Finding Connected Motifs in Vertex-Colored Graphs , 2007, ICALP.

[6]  Guillaume Blin,et al.  GraMoFoNe: a Cytoscape Plugin for Querying Motifs without Topology in Protein-Protein Interactions Networks , 2010, BICoB.

[7]  Ioannis Xenarios,et al.  DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions , 2002, Nucleic Acids Res..

[8]  P. Bork,et al.  Functional organization of the yeast proteome by systematic analysis of protein complexes , 2002, Nature.

[9]  Stéphan Thomassé.  A quadratic kernel for feedback vertex set , 2009, SODA.

[10]  R. Karp,et al.  Conserved pathways within bacteria and yeast as revealed by global protein network alignment , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[11]  Richard M. Karp,et al.  Reducibility among combinatorial problems, in Complexity of Computer Computations , 1972.

[12]  Michael R. Fellows,et al.  Parameterized Complexity , 1998 .

[13]  T. Ideker,et al.  Comprehensive curation and analysis of global interaction networks in Saccharomyces cerevisiae , 2006, Journal of biology.

[14]  Hans L. Bodlaender,et al.  A Tourist Guide through Treewidth , 1993, Acta Cybern..

[15]  Paul Dent,et al.  MAPK pathways in radiation responses , 2003, Oncogene.

[16]  James R. Knight,et al.  A comprehensive analysis of protein–protein interactions in Saccharomyces cerevisiae , 2000, Nature.

[17]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.

[18]  Robert W. Harrison,et al.  Fast Alignments of Metabolic Networks , 2008, 2008 IEEE International Conference on Bioinformatics and Biomedicine.

[19]  D. Eisenberg,et al.  Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. , 1999, Proceedings of the National Academy of Sciences of the United States of America.

[20]  Susumu Goto,et al.  The KEGG resource for deciphering the genome , 2004, Nucleic Acids Res..

[21]  Arie M. C. A. Koster,et al.  Combinatorial Optimization on Graphs of Bounded Treewidth , 2008, Comput. J..

[22]  Riccardo Dondi,et al.  Maximum Motif Problem in Vertex-Colored Graphs , 2009, CPM.

[23]  Denis R. Hirschfeldt,et al.  Parameterized complexity: new developments and research frontiers , 2001 .

[24]  Mam Riess Jones.  Color Coding , 1962, Human factors.

[25]  Roded Sharan,et al.  QPath: a method for querying pathways in a protein-protein interaction network , 2006, BMC Bioinformatics.

[26]  Christian Komusiewicz,et al.  Parameterized Algorithms and Hardness Results for Some Graph Motif Problems , 2008, CPM.

[27]  Cristina G. Fernandes,et al.  Motif Search in Graphs: Application to Metabolic Networks , 2006, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[28]  Richard M. Karp,et al.  Reducibility Among Combinatorial Problems , 1972, 50 Years of Integer Programming.

[29]  Rolf Niedermeier,et al.  Compression-based fixed-parameter algorithms for feedback vertex set and edge bipartization , 2006, J. Comput. Syst. Sci..

[30]  Roded Sharan,et al.  Topology-Free Querying of Protein Interaction Networks , 2009, RECOMB.

[31]  Derek G. Corneil,et al.  Complexity of finding embeddings in a k -tree , 1987 .

[32]  David S. Johnson,et al.  Computers and Intractability: A Guide to the Theory of NP-Completeness , 1978 .

[33]  Ron Y. Pinter,et al.  Alignment of metabolic pathways , 2005, Bioinform..

[34]  Thomas Zichner,et al.  Algorithm Engineering for Color-Coding to Facilitate Signaling Pathway Detection , 2007, APBC.

[35]  Riccardo Dondi,et al.  Weak pattern matching in colored graphs: Minimizing the number of connected components , 2007, ICTCS.

[36]  Gary D Bader,et al.  Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry , 2002, Nature.