A study on multi-omic oscillations in Escherichia coli metabolic networks

Background

Two important challenges in the analysis of molecular biology information are the integration of multi-omic data and the detection of patterns across large-scale molecular networks and sequences. The two are coupled, because integrating omic information may provide better means to detect multi-omic patterns that could reveal multi-scale or emergent properties at the phenotype level.

Results

Here we address the problem of integrating various types of molecular information (a large collection of gene expression and sequence data, codon usage and protein abundances) to analyse the E. coli metabolic response to treatments at the whole-network level. Our algorithm, MORA (Multi-omic relations adjacency), detects patterns that may represent metabolic network motifs at the pathway and supra-pathway levels and that could hint at a functional role. We describe the algorithm and provide insights into it by testing it on a large database of responses to antibiotics. Along with MORA, we propose a novel model for the analysis of oscillating multi-omics. Interestingly, the resulting analysis suggests that some motifs reveal recurring oscillating or position variation patterns on multi-omic metabolic networks. Our framework, implemented in R, provides effective and user-friendly means to design intervention scenarios on real data. By analysing how multi-omic data build up multi-scale phenotypes, the software allows users to compare and test metabolic models, design new pathways or redesign existing ones, and validate in silico metabolic models using nearby species.

Conclusions

The integration of multi-omic data reveals that E. coli multi-omic metabolic networks contain position-dependent and recurring patterns, which could provide clues to long-range correlations in the bacterial genome.
