Quantitative Quality Control in Microarray Experiments and the Application in Data Filtering, Normalization and False Positive Rate Prediction

Data preprocessing, including proper normalization and adequate quality control, is crucial before complex data mining in studies that use cDNA microarray technology. We have developed a simple procedure that integrates data filtering and normalization with quantitative quality control of microarray experiments. We have previously shown that data variability in a microarray experiment is well captured by a quality score q(com) defined for every spot, and that the ratio distribution depends on q(com). Building on this, our data-filtering scheme lets the investigator set the filtering stringency according to the desired data variability, and our normalization procedure corrects the q(com)-dependent dye biases in both the location and the spread of the ratio distribution. In addition, we propose a statistical model for determining the false positive rate based on the design and quality of a microarray experiment. The model predicts that a replicate concordance rate of at least 0.5 is needed in order to be certain of true positives. Our work demonstrates the importance and advantages of a quantitative quality control scheme for microarrays.
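The filtering and normalization steps described above can be illustrated with a minimal sketch. It is not the procedure of the paper: it assumes each spot already carries a log ratio and a quality score q(com), and the function names, the equal-count binning, and the use of the median and median absolute deviation (MAD) for location and spread are all assumptions made for illustration. Spots below a user-chosen quality threshold are dropped, and the remaining log ratios are centered and rescaled to unit MAD within bins of similar quality.

```python
from statistics import median


def filter_spots(log_ratios, q_scores, q_min):
    """Keep (log_ratio, q) pairs whose quality score is at least q_min.

    q_min is a hypothetical stringency knob chosen by the investigator.
    """
    return [(m, q) for m, q in zip(log_ratios, q_scores) if q >= q_min]


def normalize_by_quality(spots, n_bins=4):
    """Center and rescale log ratios within equal-count bins of q.

    Assumption: location is estimated by the bin median and spread by
    the bin MAD; each bin is rescaled to unit MAD.
    """
    ordered = sorted(spots, key=lambda s: s[1])  # ascending quality
    n = len(ordered)
    normalized = []
    for k in range(n_bins):
        chunk = ordered[k * n // n_bins:(k + 1) * n // n_bins]
        if not chunk:
            continue  # more bins than spots
        ratios = [m for m, _ in chunk]
        center = median(ratios)
        mad = median([abs(m - center) for m in ratios])
        scale = mad if mad > 0 else 1.0  # avoid division by zero
        normalized.extend(((m - center) / scale, q) for m, q in chunk)
    return normalized


# Toy data: two quality levels with different location and spread,
# plus one low-quality spot that the filter removes.
log_ratios = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0, 99.0]
q_scores = [0.9, 0.9, 0.9, 0.5, 0.5, 0.5, 0.1]
kept = filter_spots(log_ratios, q_scores, q_min=0.3)
result = normalize_by_quality(kept, n_bins=2)
```

In this toy example both quality bins end up with the same center and spread after normalization, which is the kind of quality-dependent correction the abstract describes. A real analysis would more likely use a smooth fit over q(com) than hard bins.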
