Comparison of standardization approaches applied to metabolomics data

Unwanted variation can hinder the identification of biomarkers in metabolomics and proteomics analyses, so the data require preprocessing, including normalization (also called standardization), before marker selection. Many standardization approaches have been applied to metabolomics data, and even to proteomics data, but comprehensive comparisons of their performance across sample sizes and methods are rare. This study compared these methods on a metabolomics dataset. The 15 standardization approaches were classified into four groups according to their performance at different sample sizes. Log Transformation and the VSN method were the best-performing methods, whereas the Contrast method performed consistently worst across datasets of all sample sizes. These results may guide the choice of a suitable standardization approach for LC/MS-based metabolomics and proteomics analyses.
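As a rough illustration of what the two best-performing families of methods do to raw intensities, the sketch below applies a plain log transformation and a generalized-log (glog) transformation to a made-up row of peak intensities. The glog formula is only a simplified stand-in for the variance-stabilizing idea behind VSN, not the VSN procedure itself (which also estimates calibration parameters from the data). The function names, the `offset` and `lam` parameters, and the sample values are assumptions for illustration and do not come from the study.

```python
import math

def log_transform(row, offset=1.0):
    """Log2-transform intensities; `offset` (an assumption) avoids log(0)."""
    return [math.log2(v + offset) for v in row]

def glog(row, lam=1.0):
    """Generalized log: log2((v + sqrt(v^2 + lam)) / 2).

    Behaves like log2(v) for large v but stays finite at v = 0.
    `lam` is a hypothetical tuning constant, fixed here rather than estimated.
    """
    return [math.log2((v + math.sqrt(v * v + lam)) / 2.0) for v in row]

# Hypothetical peak intensities for one sample (not data from the study).
sample = [0.0, 3.0, 15000.0]
logged = log_transform(sample)
glogged = glog(sample)
```

Both transformations compress the wide dynamic range typical of LC/MS intensities; they differ mainly near zero, where the glog remains defined without an added offset.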
