Spot shape modelling and data transformations for microarrays

MOTIVATION: To study genes with low expression in microarray experiments, it is useful to increase the photometric gain during scanning. However, a large gain may cause some pixels of highly expressed genes to become saturated. Spatial statistical models that describe spot shapes at the pixel level can be used to infer the saturated pixel intensities. Other possible applications of spot shape models include data quality control and accurate determination of spot centres and spot diameters.

RESULTS: Spatial statistical models for spotted microarrays are studied, including pixel-level transformations and spot shape models. The models are applied to a dataset from 50mer oligonucleotide microarrays with 452 selected Arabidopsis genes. Logarithmic, Box-Cox and inverse hyperbolic sine transformations are compared in combination with four spot shape models: a cylindrical plateau shape, an isotropic Gaussian distribution and a difference of two scaled Gaussian distributions, all suggested in the literature, as well as a proposed new polynomial-hyperbolic spot shape model. For the dataset studied, the polynomial-hyperbolic spot shape model in combination with the Box-Cox transformation gives a substantial improvement. The spatial statistical models are used to correct spot measurements with saturation by extrapolating the censored data.

AVAILABILITY: Source code for R is available at http://www.matfys.kvl.dk/~ekstrom/spotshapes/
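To make the ingredients named above concrete, the following is a minimal Python sketch of the three pixel-level transformations and of an isotropic Gaussian spot shape with saturation. It is not the authors' R code. The function names, the parameterisation of the Gaussian (centre, height, width `sigma`, additive background) and the saturation ceiling of 65535 (a 16-bit scanner) are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def log_transform(y):
    """Logarithmic transformation of a positive pixel intensity."""
    return math.log(y)


def box_cox(y, lam):
    """Box-Cox transformation; reduces to the logarithm when lam == 0."""
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def asinh_transform(y, scale=1.0):
    """Inverse hyperbolic sine transformation.

    `scale` is a hypothetical tuning parameter; for y much larger than
    scale the result behaves like log(2 * y / scale).
    """
    return math.asinh(y / scale)


def gaussian_spot(x, y, cx, cy, height, sigma, background=0.0):
    """Isotropic Gaussian spot shape evaluated at pixel (x, y).

    Assumed parameterisation: peak `height` at centre (cx, cy),
    width `sigma`, plus a constant `background`.
    """
    r2 = (x - cx) ** 2 + (y - cy) ** 2
    return background + height * math.exp(-r2 / (2.0 * sigma ** 2))


def saturate(intensity, ceiling=65535.0):
    """Censor an intensity at the scanner ceiling (assumed 16-bit)."""
    return min(intensity, ceiling)


# A spot whose true peak exceeds the ceiling: central pixels are censored,
# while pixels further from the centre are still observed exactly.
true_centre = gaussian_spot(5, 5, 5, 5, height=90000.0, sigma=2.0)
true_edge = gaussian_spot(9, 5, 5, 5, height=90000.0, sigma=2.0)
observed_centre = saturate(true_centre)
observed_edge = saturate(true_edge)
```

In this sketch the edge pixel is unaffected by saturation, which is what allows a fitted spot shape model to extrapolate the censored central intensities from the unsaturated pixels around them.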
