Link and Node Prediction in Metabolic Networks with Probabilistic Logic

Information on metabolic processes for hundreds of organisms is available in public databases. However, this information is often incomplete or uncertain. Systems capable of automatically curating these databases and of suggesting fillings for pathway holes are therefore needed. To this end, such systems should exploit data available from related organisms and cope with heterogeneous sources of information (e.g. phylogenetic relations). Here we begin to investigate two fundamental problems in the automatic curation of metabolic networks, namely link prediction and node prediction, using ProbLog, a simple yet powerful extension of the logic programming language Prolog with independent random variables.
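To make the role of independent random variables concrete, the following minimal sketch computes the probability that one metabolite is connected to another when each reaction edge is present independently with a given probability. It is an illustration only, not the implementation used in this work: the metabolite names, the edge probabilities, and the functions `reachable` and `path_probability` are hypothetical, and the brute-force enumeration of possible worlds stands in for the far more efficient inference that ProbLog performs.

```python
from itertools import product

# Hypothetical probabilistic edges: (substrate, product) -> probability
# that the reaction linking them is actually present in the organism.
edges = {
    ("a", "b"): 0.8,
    ("b", "c"): 0.5,
    ("a", "c"): 0.3,
}


def reachable(present, source, target):
    """Return True if target can be reached from source using the given edges."""
    frontier, seen = [source], {source}
    while frontier:
        node = frontier.pop()
        if node == target:
            return True
        for u, v in present:
            if u == node and v not in seen:
                seen.add(v)
                frontier.append(v)
    return False


def path_probability(edges, source, target):
    """Sum the weights of all possible worlds in which a path exists.

    Each edge is an independent random variable, so the weight of a world
    is the product of p (edge present) or 1 - p (edge absent) over all edges.
    The enumeration is exponential in the number of edges.
    """
    items = list(edges.items())
    total = 0.0
    for world in product([True, False], repeat=len(items)):
        weight = 1.0
        present = []
        for included, (edge, p) in zip(world, items):
            weight *= p if included else 1.0 - p
            if included:
                present.append(edge)
        if reachable(present, source, target):
            total += weight
    return total
```

In this toy network, `path_probability(edges, "a", "c")` combines the direct edge with the two-step route through `b`; the two routes share no edges, so the result equals 1 - (1 - 0.3)(1 - 0.8 × 0.5) = 0.58. A candidate link could then be scored, under these assumptions, by how much it changes such connection probabilities.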
