Algorithms for Regular Tree Grammar Network Search and Their Application to Mining Human-viral Infection Patterns

Network querying is a powerful approach to mining molecular interaction networks. Most state-of-the-art network querying tools either confine the search to a prespecified topology, given as a template subnetwork, or impose no topological constraints at all. Grammar-based queries offer a more flexible and expressive alternative, because the topology of the sought pattern is specified by a grammar. Previous grammar-based network querying tools were confined to the identification of paths. In this article, we extend the patterns identified by grammar-based query approaches from paths to trees. For this, we adopt a higher-order query descriptor in the form of a regular tree grammar (RTG). We introduce a novel problem and propose an algorithm to search a given graph for the k highest-scoring subgraphs matching a tree accepted by an RTG. Our algorithm combines dynamic programming with color coding, and includes an extension of previous k-best parsing optimization approaches to avoid isomorphic trees in the output. We implement the new algorithm and exemplify its application to mining viral infection patterns within molecular interaction networks. Our code is available online.
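To illustrate how dynamic programming and color coding fit together, the following is a minimal sketch of the classical color-coding technique for a much simpler task: finding the highest-scoring simple path on k vertices in a weighted graph. It is not the RTG algorithm described above and does not handle trees, grammars, or k-best output. The function names, the adjacency-dictionary input format, and the `trials` and `seed` parameters are assumptions made for illustration only.

```python
import random


def best_colorful_path(adj, weight, k, colors):
    """Best total edge weight of a path on k vertices with pairwise distinct colors.

    adj:    dict mapping each vertex to an iterable of neighbors (undirected graph)
    weight: dict mapping frozenset({u, v}) to the weight of edge u-v
    colors: dict mapping each vertex to a color in range(k)
    Returns None if no such path exists under this coloring.
    """
    # dp[(v, mask)] = best weight of a colorful path ending at v
    # whose set of used colors is encoded by the bitmask `mask`.
    dp = {(v, 1 << colors[v]): 0.0 for v in adj}
    for _ in range(k - 1):
        nxt = {}
        for (v, mask), w in dp.items():
            for u in adj[v]:
                bit = 1 << colors[u]
                if mask & bit:
                    # Color already used: extending would not stay colorful.
                    continue
                key = (u, mask | bit)
                cand = w + weight[frozenset((u, v))]
                if cand > nxt.get(key, float("-inf")):
                    nxt[key] = cand
        dp = nxt
    return max(dp.values(), default=None)


def best_path_color_coding(adj, weight, k, trials=200, seed=0):
    """Repeat random colorings and keep the best colorful path found.

    Distinct colors force distinct vertices, so every path found is simple.
    The result is a lower bound that matches the optimum with high
    probability when `trials` is large enough.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        colors = {v: rng.randrange(k) for v in adj}
        score = best_colorful_path(adj, weight, k, colors)
        if score is not None and (best is None or score > best):
            best = score
    return best
```

Each random coloring makes a fixed k-vertex path colorful with probability k!/k^k, so the number of trials needed for a given success probability grows exponentially in k but is independent of the graph size. Tracking only the set of used colors, rather than the set of visited vertices, is what keeps the dynamic programming table small.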
