A Study of the Statistical Distribution of the Intensity of Pixels within Spots of DNA Microarrays: What is the Appropriate Single-Valued Representative?

In this paper we open a discussion about an important issue in the analysis of data from spotted DNA microarrays: how to summarise the distribution of the intensity values of the pixels within a spot into a single value. Although the median is the most popular statistic, no clear study demonstrates why it is more appropriate than other measures of central tendency such as the mean or the mode. We argue that the median intensity is not the most appropriate measure in many common cases, and we discuss the commonly encountered case of the “donut”-shaped spot, for which the mode is closest to the “expected” spot intensity. For an “ideal” spot, with a clear boundary and uniform hybridisation, the intensities of its pixels should be approximately normally distributed. In practice, these two requirements are often not met because of the physical properties of the pins and particularities of the printing and hybridisation processes. As a consequence, the distribution of pixel intensities is usually negatively skewed. This asymmetry displaces the mean and the median further from the ideal situation described above than it displaces the mode.
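The effect of negative skew on the three statistics can be illustrated with a small simulation. The following is only a minimal sketch on synthetic data, not the procedure used in this study: the expected intensity of 1000, the spread of 50, the 60/40 split between well-hybridised pixels and "donut hole" pixels, the uniform range of the hole pixels, and the histogram-based mode estimate (function `summarize_spot`, 50 bins) are all assumptions chosen for illustration. The pixel count is also deliberately far larger than that of a real spot, so that the mode estimate is stable.

```python
import numpy as np


def summarize_spot(pixels, bins=50):
    """Return the mean, the median and a histogram-based mode estimate."""
    pixels = np.asarray(pixels, dtype=float)
    counts, edges = np.histogram(pixels, bins=bins)
    peak = np.argmax(counts)  # index of the most populated bin
    mode = 0.5 * (edges[peak] + edges[peak + 1])  # centre of that bin
    return {
        "mean": float(pixels.mean()),
        "median": float(np.median(pixels)),
        "mode": float(mode),
    }


rng = np.random.default_rng(0)
expected = 1000.0  # assumed "expected" spot intensity (arbitrary units)

# Well-hybridised pixels: approximately normal around the expected intensity.
ideal = rng.normal(loc=expected, scale=50.0, size=6000)

# "Donut hole" pixels: weaker signal, modelled here as uniform on [300, 900).
hole = rng.uniform(low=300.0, high=900.0, size=4000)

# The combined spot has a long tail towards low intensities (negative skew).
spot = np.concatenate([ideal, hole])
stats = summarize_spot(spot)

for name, value in stats.items():
    print(f"{name:>6}: {value:8.1f}  (offset from expected: {value - expected:+.1f})")
```

Under these assumptions the low-intensity tail pulls the mean furthest below the expected intensity, the median less so, and the mode least. The histogram-based mode depends on the bin width; a kernel density estimate would be a reasonable alternative for spots with few pixels.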