Normalyzer: A Tool for Rapid Evaluation of Normalization Methods for Omics Data Sets

High-throughput omics data often contain systematic biases introduced during sample processing and data generation. Because the sources of these biases are usually unknown, it is difficult to select an optimal normalization method for a given data set. To facilitate this selection, we introduce the open-source tool “Normalyzer”. It normalizes the data with 12 different normalization methods and generates a report with several quantitative and qualitative plots for comparative evaluation of the methods. The usefulness of Normalyzer is demonstrated with three case studies from quantitative proteomics and transcriptomics. The results from these case studies show that the choice of normalization method strongly influences the outcome of downstream quantitative comparisons. Normalyzer is an R package and can be used locally or through the online implementation at http://quantitativeproteomics.org/normalyzer.
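
The general idea of normalizing one data set with several methods and then comparing them on a common metric can be illustrated with a minimal sketch. This is not Normalyzer's code or API (Normalyzer is an R package); it is a Python toy under stated assumptions: the two global methods (median and mean centering on the log2 scale), the evaluation metric (average per-feature standard deviation across replicates), the function names, and the data are all chosen here for illustration only.

```python
import math
import statistics


def log2_transform(matrix):
    """Log2-transform a features x samples intensity matrix."""
    return [[math.log2(v) for v in row] for row in matrix]


def center_normalize(log_matrix, center):
    """Shift each sample (column) so that its center (median or mean)
    matches the average center over all samples."""
    columns = list(zip(*log_matrix))
    centers = [center(col) for col in columns]
    target = statistics.mean(centers)
    return [
        [v - centers[j] + target for j, v in enumerate(row)]
        for row in log_matrix
    ]


def mean_replicate_sd(log_matrix):
    """Average per-feature standard deviation across replicate samples.
    Lower values mean replicates agree better (illustrative metric)."""
    return statistics.mean(statistics.stdev(row) for row in log_matrix)


# Hypothetical toy data: 4 features measured in 3 replicate samples.
# Sample 2 is loaded at 2x and sample 3 at 0.5x (a pure scaling bias).
base = [100.0, 200.0, 400.0, 800.0]
factors = [1.0, 2.0, 0.5]
raw = [[b * f for f in factors] for b in base]

log_raw = log2_transform(raw)
normalized = {
    "log2 only": log_raw,
    "median": center_normalize(log_raw, statistics.median),
    "mean": center_normalize(log_raw, statistics.mean),
}

# Score every method with the same metric so they can be compared.
scores = {name: mean_replicate_sd(m) for name, m in normalized.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Because the bias in this toy data is a pure per-sample scaling factor, both centering methods remove it and the replicate spread drops to essentially zero. Real data sets with intensity-dependent or unknown biases are where methods diverge, which is why a side-by-side comparison is useful.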
