Clinical Bioinformatics for Biomarker Discovery in Targeted Metabolomics

This chapter discusses methods of clinical bioinformatics in targeted metabolomics, with an emphasis on the discovery of metabolic biomarkers. The reader is introduced to general aspects such as initiatives in metabolomics standardization, regulatory guidelines, and software validation, and is given an overview of the bioinformatics workflow in metabolomics. Engineering-based concepts of clinical bioinformatics are discussed that support the storage and automated analysis of samples, the integration of data in public repositories, and the management of data using metabolomics application software. Chemometrics algorithms for data processing are summarized, and modalities of biostatistics and data analysis are presented, along with data mining and machine learning approaches aimed at the discovery of biomarkers in targeted metabolomics. Methods of data interpretation in the context of annotated biochemical pathways are suggested, theoretical concepts of metabolic modeling and engineering are introduced, and the in-silico modeling and simulation of molecular processes is briefly touched upon. Finally, a short outlook is given on future perspectives in the application of clinical bioinformatics in targeted metabolomics, for example the development of integrated mass spectrometry solutions ready for routine clinical use in laboratory medicine, or the application of artificial intelligence concepts in laboratory automation, such as liquid handling robots that autonomously perform experiments and generate hypotheses.
