The distribution-based p-value for the outlier sum in differential gene expression analysis.

Outlier sums were proposed by Tibshirani & Hastie (2007) and Wu (2007) for detecting outlier genes where only a small subset of disease samples shows unusually high gene expression, but they did not develop their distributional properties and formal statistical inference. In this study, a new outlier sum for detection of outlier genes is proposed, its asymptotic distribution theory is developed, and the p-value based on this outlier sum is formulated. Its analytic form is derived on the basis of the large-sample theory. We compare the proposed method with existing outlier sum methods by power comparisons. Our method is applied to DNA microarray data from samples of primary breast tumors examined by Huang et al. (2003). The results show that the proposed method is more efficient in detecting outlier genes.