Individual Channel Analysis of Two-Colour Microarrays

The traditional approach to the analysis of data from two-colour spotted microarrays is to compute the log-ratio of the expression values for each spot (Chen et al, 1997). The log-ratios are then treated as the responses in any statistical analysis of the data (Yang and Speed, 2003; Smyth, 2004). Relatively few papers have analysed spotted microarrays in terms of the separate red and green log-intensities (Kerr et al, 2000; Jin et al, 2001; Wolfinger et al, 2001). The second and third of these papers popularised a mixed model approach in which each spot is treated as a randomised block of size two.

A number of papers, starting with Yang et al (2001), have summarised red and green channel intensities in terms of M-values (log-ratios) and A-values (spot log-intensities) for the purposes of graphical display and normalisation. This paper demonstrates that the usefulness of this partition arises in good part from the fact that the M- and A-values for a given spot are approximately independent, even though the individual intensities are highly correlated. The paper then reformulates the mixed model approach in terms of the M- and A-values. This reformulation not only yields an efficient algorithm for estimating the mixed model but also elucidates the difference between the traditional log-ratio approach and the analysis of individual channels. The individual-channel approach amounts to recovering information from the between-spot error stratum, i.e., from comparisons between the A-values. There are as yet no papers which compare individual-channel with log-ratio analyses, and this paper quantifies the efficiency gains which can arise from individual-channel analysis. The paper goes on to develop two new methods for individual-channel analysis which borrow information from the ensemble of probes when making inference about each individual probe. The first is an empirical Bayes method of smoothing the within-spot and between-spot components of variance. The second is based on pooling the within-spot correlation estimators. The new methods result in more stable inference than does the usual mixed model approach, especially when the number of arrays is small.

Individual-channel analysis raises new and non-trivial normalisation issues in addition to those which arise in log-ratio analyses (Yang and Thorne, 2003). In this paper it will be assumed that appropriate normalisation has already been done.
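The relationship between channel intensities and M- and A-values can be illustrated with a small simulation. The following is a minimal sketch and not the method developed in this paper. It assumes a toy model in which the two log-intensities of a spot share a common spot effect and carry independent channel errors of equal variance. The function names, parameter values and the model itself are illustrative assumptions only.

```python
import numpy as np


def ma_values(red, green):
    """Return M-values (log-ratios) and A-values (average log-intensities)."""
    log_r = np.log2(red)
    log_g = np.log2(green)
    m = log_r - log_g
    a = (log_r + log_g) / 2.0
    return m, a


def simulate_spots(n_spots=5000, spot_sd=2.0, channel_sd=0.3, seed=0):
    """Simulate red and green intensities under an assumed toy model.

    Each spot has a shared effect (large variance) plus independent
    channel errors of equal variance. This is an illustration only.
    """
    rng = np.random.default_rng(seed)
    spot = rng.normal(10.0, spot_sd, n_spots)
    log_r = spot + rng.normal(0.0, channel_sd, n_spots)
    log_g = spot + rng.normal(0.0, channel_sd, n_spots)
    return 2.0 ** log_r, 2.0 ** log_g


red, green = simulate_spots()
m, a = ma_values(red, green)

# Correlation between the two log-intensities: high, because of the shared spot effect.
corr_channels = np.corrcoef(np.log2(red), np.log2(green))[0, 1]

# Correlation between M and A: near zero when the channel variances are equal.
corr_ma = np.corrcoef(m, a)[0, 1]

print(f"corr(log R, log G) = {corr_channels:.3f}")
print(f"corr(M, A)         = {corr_ma:.3f}")
```

Under this assumed model the covariance of M and A equals half the difference between the two channel error variances, so it vanishes when those variances are equal. With real data the independence is only approximate.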