A robust neural networks approach for spatial and intensity-dependent normalization of cDNA microarray data

MOTIVATION Microarray experiments are affected by numerous sources of non-biological variation that contribute systematic bias to the resulting data. In a dual-label (two-color) cDNA or long-oligonucleotide microarray, these systematic biases are often manifested as an imbalance of measured fluorescent intensities corresponding to Sample A versus those corresponding to Sample B. Systematic biases also affect between-slide comparisons. Making effective corrections for these systematic biases is a requisite for detecting the underlying biological variation between samples. Effective data normalization is therefore an essential step in the confident identification of biologically relevant differences in gene expression profiles. Several normalization methods for the correction of systemic bias have been described. While many of these methods have addressed intensity-dependent bias, few have addressed both intensity-dependent and spatiality-dependent bias. RESULTS We present a neural network-based normalization method for correcting the intensity- and spatiality-dependent bias in cDNA microarray datasets. In this normalization method, the dependence of the log-intensity ratio (M) on the average log-intensity (A) as well as on the spatial coordinates (X,Y) of spots is approximated with a feed-forward neural network function. Resistance to outliers is provided by assigning weights to each spot based on how distant their M values is from the median over the spots whose A values are similar, as well as by using pseudospatial coordinates instead of spot row and column indices. A comparison of the robust neural network method with other published methods demonstrates its potential in reducing both intensity-dependent bias and spatial-dependent bias, which translates to more reliable identification of truly regulated genes.

[1]  Taesung Park,et al.  Evaluation of normalization methods for microarray data , 2003 .

[2]  Dale L. Wilson,et al.  New Normalization Methods for CDNA Microarray Data , 2003, Bioinform..

[3]  S. Knudsen,et al.  A new non-linear normalization method for reducing variability in DNA microarray experiments , 2002, Genome Biology.

[4]  T. Speed,et al.  Statistical issues in cDNA microarray data analysis. , 2003, Methods in molecular biology.

[5]  K. Dobbin,et al.  Experimental design of DNA microarray experiments. , 2003, BioTechniques.

[6]  John Quackenbush Microarray data normalization and transformation , 2002, Nature Genetics.

[7]  Walter Krämer,et al.  Review of Modern applied statistics with S, 4th ed. by W.N. Venables and B.D. Ripley. Springer-Verlag 2002 , 2003 .

[8]  Yogendra P. Chaubey Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment , 1993 .

[9]  Pierre R. Bushel,et al.  STATISTICAL ANALYSIS OF A GENE EXPRESSION MICROARRAY EXPERIMENT WITH REPLICATION , 2002 .

[10]  Michael Ruogu Zhang,et al.  Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. , 1998, Molecular biology of the cell.

[11]  S. Dudoit,et al.  STATISTICAL METHODS FOR IDENTIFYING DIFFERENTIALLY EXPRESSED GENES IN REPLICATED cDNA MICROARRAY EXPERIMENTS , 2002 .

[12]  S. S. Young,et al.  Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment , 1993 .

[13]  S. Dudoit,et al.  Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. , 2002, Nucleic acids research.

[14]  Jean Yee Hwa Yang,et al.  Analysis of CDNA Microarray Images , 2001, Briefings Bioinform..

[15]  Soheil Shams,et al.  Noise Sampling Method: An ANOVA Approach Allowing Robust Selection of Differentially Regulated Genes Measured by DNA Microarrays , 2003, Bioinform..

[16]  R. Tibshirani,et al.  Significance analysis of microarrays applied to the ionizing radiation response , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[17]  R. Nadon,et al.  Statistical issues with microarrays: processing and analysis. , 2002, Trends in genetics : TIG.

[18]  Geoffrey E. Hinton,et al.  Learning internal representations by error propagation , 1986 .

[19]  A. Lagarde,et al.  DNA Microarrays: A Molecular Cloning Manual. , 2003 .

[20]  Terry Speed,et al.  Normalization of cDNA microarray data. , 2003, Methods.

[21]  Kurt Hornik,et al.  Approximation capabilities of multilayer feedforward networks , 1991, Neural Networks.

[22]  James J. Chen,et al.  Analysis of variance components in gene expression data , 2004, Bioinform..

[23]  J. Mesirov,et al.  Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. , 1999, Science.

[24]  Brian D. Ripley,et al.  Modern applied statistics with S, 4th Edition , 2002, Statistics and computing.

[25]  L. Qin,et al.  Empirical evaluation of data transformations and ranking statistics for microarray analysis. , 2004, Nucleic acids research.

[26]  S. Dudoit,et al.  Microarray expression profiling identifies genes with altered expression in HDL-deficient mice. , 2000, Genome research.

[27]  Ross Ihaka,et al.  Gentleman R: R: A language for data analysis and graphics , 1996 .