Comparison of normalization methods for CodeLink Bioarray data

Background
The quality of microarray data can seriously affect the accuracy of downstream analyses. To reduce variability and enhance signal reproducibility in these data, many normalization methods have been proposed and evaluated, most of them for data obtained from cDNA microarrays and Affymetrix GeneChips. CodeLink Bioarrays are a recently introduced, single-color oligonucleotide microarray platform. To date, no reported study has evaluated normalization methods for CodeLink Bioarrays.

Results
We compared five existing normalization approaches in terms of both noise reduction and signal retention: Median (suggested by the manufacturer), CyclicLoess, Quantile, Iset, and Qspline. These methods were applied to two real datasets generated by CodeLink Bioarrays (a time course dataset and a lung disease-related dataset) and were assessed using multiple statistical significance tests. Compared with Median, CyclicLoess and Qspline show a significant improvement, and the most consistent one, in both reduction of variability and retention of signal. CyclicLoess appears to retain more signal than Qspline. Quantile reduces more variability than Median in both datasets, yet fails to consistently retain more signal in the time course dataset. Iset does not improve over Median in either noise reduction or signal enhancement in the time course dataset.

Conclusion
Median is insufficient either to reduce variability or to retain signal effectively for CodeLink Bioarray data. CyclicLoess is a more suitable approach for normalizing these data, and it also seems to be the most effective of the five normalization strategies examined.
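To make the contrast between two of the compared approaches concrete, the following is a minimal sketch of median scaling and quantile normalization on a genes-by-arrays intensity matrix. It is an illustration under stated assumptions, not the authors' implementation: the function names `median_normalize` and `quantile_normalize` are hypothetical, NumPy is assumed as the array library, and ties are broken by position rather than averaged.

```python
import numpy as np


def median_normalize(x):
    """Divide each array (column) by its own median intensity.

    Hypothetical helper: a plain per-array median scaling.
    """
    x = np.asarray(x, dtype=float)
    return x / np.median(x, axis=0, keepdims=True)


def quantile_normalize(x):
    """Give every array (column) the same empirical distribution.

    Hypothetical helper: the reference distribution is the mean of the
    sorted columns; each value is replaced by the reference value at its
    within-column rank. Ties are broken by position (an assumption).
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0, kind="stable")  # sort order per column
    ranks = np.argsort(order, axis=0)             # rank of each entry per column
    reference = np.sort(x, axis=0).mean(axis=1)   # mean of sorted columns
    return reference[ranks]


# Toy matrix: 3 genes (rows) measured on 2 arrays (columns).
toy = np.array([[5.0, 4.0],
                [2.0, 8.0],
                [3.0, 1.0]])
median_scaled = median_normalize(toy)
quantile_scaled = quantile_normalize(toy)
```

Median scaling only shifts each array by a single factor, so intensity-dependent differences between arrays remain; quantile normalization removes them by construction, at the cost of forcing all arrays onto one distribution.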
