MathIOmica: An Integrative Platform for Dynamic Omics

Data from multiple omics technologies are rapidly becoming available, creating a need for new methods that integrate these technologies and interpret the results of multimodal assays. The MathIOmica package for Mathematica provides one of the first extensive introductions to using the Wolfram Language for such problems in bioinformatics. In particular, the package addresses the need to integrate multiple omics datasets arising from dynamic profiling in a personalized medicine approach. It provides tools for importing data, annotating datasets, tracking missing values, normalizing data, clustering data and visualizing the resulting classification, annotating and enumerating ontology memberships, and carrying out pathway analysis. We anticipate that MathIOmica will help not only in creating new bioinformatics tools but also in promoting interdisciplinary investigations, particularly by researchers in the mathematical, physical science and engineering fields who are transitioning into genomics, bioinformatics and omics data integration.
