ProbAnnoWeb and ProbAnnoPy: probabilistic annotation and gap-filling of metabolic reconstructions

Summary: Gap-filling is a necessary step in producing high-quality genome-scale metabolic reconstructions capable of flux-balance simulation. Most available gap-filling tools take an organism-agnostic approach, in which reactions are selected from a database to fill gaps without considering the target organism. In contrast, our likelihood-based gap-filling with probabilistic annotations selects candidate reactions using a likelihood score derived specifically from the target organism's genome. Here, we present two new implementations of probabilistic annotation and likelihood-based gap-filling: a web service called ProbAnnoWeb, and a standalone Python package called ProbAnnoPy.

Availability and implementation: Our tools are available as a web service that requires no installation (ProbAnnoWeb) at probannoweb.systemsbiology.net, and as a local Python package (ProbAnnoPy) at github.com/PriceLab/probannopy.

Contact: evangelos.simeonidis@systemsbiology.org or nathan.price@systemsbiology.org.

Supplementary information: Supplementary data are available at Bioinformatics online.
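To illustrate the difference between organism-agnostic and likelihood-based gap-filling described in the Summary, the toy sketch below may help. It is not the ProbAnnoPy API and not the optimization the tools actually solve. All function names, reaction identifiers and likelihood values are hypothetical. The cost of adding a reaction is assumed to be 1 minus its likelihood, and the search is a brute-force enumeration over a tiny candidate set, with a simple reachability check standing in for flux-balance feasibility.

```python
from itertools import combinations


def producible(reactions, seed_metabolites):
    """Network expansion: return every metabolite reachable from the seeds."""
    available = set(seed_metabolites)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            # A reaction can fire once all of its substrates are available.
            if set(substrates) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return available


def gapfill(model_reactions, candidates, likelihoods, seeds, target):
    """Pick the candidate subset that makes `target` producible at minimum
    total cost, where cost = 1 - likelihood (an assumption of this sketch)."""
    best_ids, best_cost = None, float("inf")
    ids = sorted(candidates)
    for size in range(len(ids) + 1):
        for subset in combinations(ids, size):
            trial = model_reactions + [candidates[r] for r in subset]
            if target in producible(trial, seeds):
                cost = sum(1.0 - likelihoods.get(r, 0.0) for r in subset)
                if cost < best_cost:
                    best_ids, best_cost = subset, cost
    return best_ids, best_cost


# Hypothetical draft model: A -> B exists, but nothing produces D.
model = [(("A",), ("B",))]
# Hypothetical database candidates: (substrates, products).
candidates = {
    "r1": (("B",), ("D",)),  # one-step fix, weak genomic evidence
    "r2": (("B",), ("C",)),  # two-step route, strong genomic evidence
    "r3": (("C",), ("D",)),
}
# Hypothetical likelihood scores derived from the target genome.
likelihoods = {"r1": 0.1, "r2": 0.9, "r3": 0.8}

chosen, cost = gapfill(model, candidates, likelihoods, seeds={"A"}, target="D")
print(chosen, round(cost, 3))
```

An approach that only minimizes the number of added reactions would choose the single reaction `r1`. With likelihood-derived costs, the two-step route through `r2` and `r3` is cheaper because both reactions are well supported by the hypothetical genome evidence.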
