The statistical distribution of the intensity of pixels within spots of DNA microarrays: what is the appropriate single-value representative?

This paper opens a discussion of an important issue in the analysis of data from spotted DNA microarrays: how to summarise the distribution of pixel intensities within a spot as a single value. Although the median is the most widely used statistic, no study has clearly demonstrated why it is more appropriate than other measures of central tendency, such as the mean or the mode. Here, we argue that the median intensity is not the most appropriate measure in many common cases, and we discuss a frequently encountered case, the 'doughnut'-shaped spot, for which the mode is closest to the 'expected' spot intensity. For an 'ideal' spot, one that has a clear boundary and is uniformly hybridised, the pixel intensities should be approximately normally distributed. In practice, these two requirements are often not met, owing to the physical properties of the pins and the particularities of the printing and hybridisation processes. As a consequence, the distribution of pixel intensities is usually negatively skewed, with a longer tail towards low intensities. This asymmetry displaces the mean and the median further than the mode from their values in the ideal situation described above.
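The effect of negative skew on the three statistics can be illustrated with a small simulation. The following is a minimal sketch and not the analysis used in this paper. It assumes a hypothetical 'doughnut' spot in which most pixels scatter normally around an ideal intensity while a fraction of under-hybridised centre pixels fall uniformly below it. The function names, the mixture model, and all parameter values (ideal intensity, noise level, hole fraction, bin width) are illustrative assumptions, and the mode is estimated crudely as the centre of the most populated histogram bin.

```python
import random
import statistics
from collections import Counter


def simulate_doughnut_spot(n_pixels=5000, ideal=1000.0, noise_sd=50.0,
                           hole_fraction=0.3, seed=0):
    """Simulate pixel intensities for a hypothetical doughnut-shaped spot.

    Assumed model (illustrative only): a fraction `hole_fraction` of the
    pixels lies in the under-hybridised centre and takes an intensity drawn
    uniformly between 30% and 100% of `ideal`; the remaining pixels are
    normally distributed around `ideal` with standard deviation `noise_sd`.
    """
    rng = random.Random(seed)
    pixels = []
    for _ in range(n_pixels):
        if rng.random() < hole_fraction:
            pixels.append(rng.uniform(0.3 * ideal, ideal))
        else:
            pixels.append(rng.gauss(ideal, noise_sd))
    return pixels


def histogram_mode(values, bin_width=25.0):
    """Estimate the mode as the centre of the most populated histogram bin."""
    counts = Counter(int(v // bin_width) for v in values)
    best_bin = max(counts, key=counts.get)
    return (best_bin + 0.5) * bin_width


ideal_intensity = 1000.0
spot = simulate_doughnut_spot(ideal=ideal_intensity)

summary = {
    "mean": statistics.fmean(spot),
    "median": statistics.median(spot),
    "mode": histogram_mode(spot),
}

# Report each statistic and its displacement from the assumed ideal intensity.
for name, value in summary.items():
    print(f"{name:>6}: {value:8.1f}  (displacement {value - ideal_intensity:+8.1f})")
```

Under these assumptions the low-intensity tail pulls the mean furthest below the ideal value and the median somewhat less, while the histogram mode stays within one bin of it. The mode estimate depends on the chosen bin width, so a kernel density estimate may be preferable for real spots.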