Non-parametric Statistical Tests for Informative Gene Selection

This paper presents two non-parametric statistical tests, the Kolmogorov-Smirnov (KS) test and the U statistic test, for selecting the informative genes of a tumor from microarray data, with the help of the theory of false discovery rate. To evaluate the effectiveness of these tests, we use a support vector machine (SVM) to construct a tumor diagnosis system (i.e., a binary classifier) from the informative genes identified on the colon and leukemia data sets. The experiments show that the tumor diagnosis systems built with either the KS or the U statistic test achieve good prediction accuracy on both data sets.
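
To make the selection step concrete, the following is a minimal sketch of per-gene testing followed by false discovery rate control. It is an illustration under stated assumptions, not the procedure used in the paper: the U statistic test is taken to be the Mann-Whitney U test, the FDR step is taken to be the Benjamini-Hochberg procedure (the abstract does not name one), both p-values use large-sample approximations, and the expression data are synthetic. The function names `ks_two_sample`, `mann_whitney_u`, and `benjamini_hochberg` are hypothetical.

```python
import math
import random


def ks_two_sample(x, y):
    """Two-sample KS statistic D with an asymptotic p-value (approximation)."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    # Walk both sorted samples and track the largest gap between empirical CDFs.
    while i < n and j < m:
        v = min(xs[i], ys[j])
        while i < n and xs[i] <= v:
            i += 1
        while j < m and ys[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d  # small-sample correction to the asymptotic form
    if lam < 0.3:  # the Kolmogorov tail probability is essentially 1 here
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


def mann_whitney_u(x, y):
    """U statistic with a two-sided normal-approximation p-value (no tie correction)."""
    n, m = len(x), len(y)
    # U counts pairs (xi, yj) with xi > yj; ties contribute one half.
    u = sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)
    mean_u = n * m / 2.0
    sd_u = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = (u - mean_u) / sd_u
    return u, math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(pvalues, q=0.05):
    """Indices of hypotheses rejected by the Benjamini-Hochberg step-up rule."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= q * rank / m:
            cutoff = rank  # keep the largest rank that passes its threshold
    return sorted(order[:cutoff])


# Synthetic data (assumption): 200 genes, 20 tumor and 20 normal samples;
# only the first 10 genes have shifted expression in the tumor group.
rng = random.Random(0)
genes = []
for g in range(200):
    shift = 3.0 if g < 10 else 0.0
    tumor = [rng.gauss(shift, 1.0) for _ in range(20)]
    normal = [rng.gauss(0.0, 1.0) for _ in range(20)]
    genes.append((tumor, normal))

ks_pvalues = [ks_two_sample(t, c)[1] for t, c in genes]
u_pvalues = [mann_whitney_u(t, c)[1] for t, c in genes]

selected_ks = benjamini_hochberg(ks_pvalues, q=0.05)
selected_u = benjamini_hochberg(u_pvalues, q=0.05)
print("KS-selected genes:", selected_ks)
print("U-selected genes:", selected_u)
```

The indices in `selected_ks` or `selected_u` would then define the feature subset passed to a binary classifier such as an SVM; that stage is omitted here because it depends on an external library.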