Automatic comparison of metabolites names: impact of criteria thresholds

The growing number of stoichiometric reconstructions and models is changing the model-building process. Instead of creating a new model from scratch, scientists can examine relevant earlier models to assess the opinions and level of consensus among other modellers. Several initiatives have followed this approach to build consensus models for particular organisms. One possible improvement to model development that takes earlier models into account is automated comparison of models, which is feasible because models are usually stored in a computer-readable format so that they can be simulated. Problems remain, however: metabolites are named in different ways, some models lack metabolite formulae, compartments are defined differently, and research groups have followed different conventions at different times. Several software tools offer reconciliation or mapping of metabolites to assess the similarity of models. Finding metabolite pairs with identical names in two models is computationally trivial. Comparison algorithms often lack flexibility, though, so metabolite pairs must be curated manually to recognize that a difference is caused only by symbols such as brackets, quotes, apostrophes, spaces, or upper/lower case letters. The proposed approach combines automated comparison with manual curation: the computer rejects most of the possible metabolite pairs, leaving only the most similar pairs for manual comparison. Elasticity in metabolite name comparison is introduced using the Levenshtein similarity ratio and the Levenshtein edit distance. The application of these criteria with different acceptance thresholds is analyzed by comparing two models of Saccharomyces cerevisiae with 681 and 1063 metabolites. The results are compared with manually approved pairs of matching metabolites.
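
The pre-filtering step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the exact formula for the similarity ratio, so the normalization used here (one minus the edit distance divided by the length of the longer name) is an assumption, as are the function names, the threshold values, and the choice to require both criteria at once.

```python
from itertools import product


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity_ratio(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer name."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def candidate_pairs(names_a, names_b, min_ratio=0.8, max_distance=3):
    """Keep only pairs passing both (hypothetical) thresholds,
    most similar first, as a short list for manual curation."""
    pairs = []
    for a, b in product(names_a, names_b):
        distance = edit_distance(a, b)
        ratio = similarity_ratio(a, b)
        if ratio >= min_ratio and distance <= max_distance:
            pairs.append((a, b, distance, ratio))
    return sorted(pairs, key=lambda pair: pair[3], reverse=True)


if __name__ == "__main__":
    model_a = ["D-Glucose", "ATP"]
    model_b = ["D-glucose", "Pyruvate"]
    for a, b, distance, ratio in candidate_pairs(model_a, model_b):
        print(f"{a!r} ~ {b!r}: distance={distance}, ratio={ratio:.2f}")
```

Raising `min_ratio` or lowering `max_distance` shortens the list left for manual curation but risks rejecting true matches. The exhaustive pairwise loop is kept for clarity; for two models with 681 and 1063 metabolites it evaluates roughly 724,000 pairs.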
