Highlighting Metabolic Strategies Using Network Analysis over Strain Optimization Results

The field of Metabolic Engineering has been growing, supported by the increase in the number of annotated genomes and genome-scale metabolic models. In silico strain optimization methods allow the design of mutant strains able to overproduce metabolites of interest in Biotechnology. With these methods it is possible to reach (near-)optimal solutions, i.e. strains that provide the desired phenotype in computational phenotype simulations. However, validating the results involves understanding the strategies these mutant strains follow to achieve the desired phenotype, which requires studying how the mutants use reactions and pathways differently. This is quite complex given the size of the networks and the interactions between (sometimes distant) components, so manual verification and comparison of phenotypes is typically infeasible. Here, automatic methods are proposed to analyse large sets of mutant strains by taking the phenotypes of a large number of possible solutions and identifying shared patterns, using methods from network topology analysis. The topological comparison between the networks of the wild type and mutant strains highlights the major changes that lead to successful mutants. The methods are applied to a case study on the production of succinate in E. coli, optimizing the set of gene knockouts to apply to the wild type. Solutions obtained with Simulated Annealing and Evolutionary Algorithms are analysed. The results show that these methods can help identify the strategies leading to the overproduction of succinate.
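As a rough illustration of what "identifying shared patterns" across many mutant phenotypes could look like, the following minimal sketch compares the set of active reactions in each mutant against the wild type and reports the reactions that are switched on or off in at least half of the mutants. It is not the method of this work: the function names, the activity tolerance `EPS`, the `min_fraction` cutoff, and the reaction identifiers and flux values are all hypothetical, chosen only to make the idea concrete.

```python
from collections import Counter

EPS = 1e-9  # assumed tolerance: fluxes with |v| <= EPS count as inactive


def active_reactions(fluxes, eps=EPS):
    """Return the set of reaction ids carrying non-negligible flux."""
    return {rid for rid, v in fluxes.items() if abs(v) > eps}


def shared_changes(wild_type, mutants, min_fraction=0.5):
    """Find reactions gained or lost (vs. wild type) in many mutants.

    wild_type: dict mapping reaction id -> simulated flux
    mutants:   list of such dicts, one per mutant solution
    Returns (gained, lost): sorted lists of reaction ids whose activity
    changes in at least `min_fraction` of the mutants.
    """
    wt_active = active_reactions(wild_type)
    gained, lost = Counter(), Counter()
    for fluxes in mutants:
        mut_active = active_reactions(fluxes)
        gained.update(mut_active - wt_active)  # active only in the mutant
        lost.update(wt_active - mut_active)    # active only in the wild type
    cutoff = min_fraction * len(mutants)
    return (
        sorted(r for r, n in gained.items() if n >= cutoff),
        sorted(r for r, n in lost.items() if n >= cutoff),
    )


# Toy data with hypothetical reaction ids and flux values.
wild_type = {"R_a": 10.0, "R_b": 5.0, "R_c": 0.0, "R_d": 0.0}
mutants = [
    {"R_a": 8.0, "R_b": 0.0, "R_c": 3.0, "R_d": 0.0},
    {"R_a": 7.0, "R_b": 0.0, "R_c": 2.5, "R_d": 1.0},
    {"R_a": 9.0, "R_b": 4.0, "R_c": 1.0, "R_d": 0.0},
]
gained, lost = shared_changes(wild_type, mutants)
print("gained in most mutants:", gained)
print("lost in most mutants:", lost)
```

A set-based comparison like this ignores flux magnitudes and network connectivity. A topological analysis would additionally take into account how the changed reactions are linked through shared metabolites.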
