Feature selection in the classification of high-dimension data

Contemporary biological technologies produce extremely high-dimensional data sets with limited samples, which makes feature selection necessary in classifier design. To date, the dimensionalities considered in comparative studies of feature selection have been nowhere near those currently in use. This study compares some basic feature-selection methods in settings involving 20,000 features, using distribution models that define different kinds of relations among the features. It evaluates the performance of the feature-selection algorithms, and the results show some general trends relative to sample size and the relations among the features.
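To make the setting concrete, the following minimal sketch simulates a small-sample, 20,000-feature problem and applies one basic filter-style selector. Everything specific in it is an assumption made for illustration, not something taken from the study: the Welch t-statistic ranking, the two-class Gaussian model with independent features (the study's models include relations among features, which this sketch omits), and the hypothetical parameter values (30 samples per class, 20 marker features, a mean shift of 1.5).

```python
import math
import random


def make_data(n_per_class, n_features, n_markers, shift, rng):
    """Two Gaussian classes with unit variance; only the first n_markers
    features have a different mean in class 1 (illustrative assumption)."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [
                rng.gauss(shift if (label == 1 and j < n_markers) else 0.0, 1.0)
                for j in range(n_features)
            ]
            X.append(row)
            y.append(label)
    return X, y


def welch_t(a, b):
    """Welch two-sample t-statistic for two lists of floats."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    va = sum((v - ma) ** 2 for v in a) / (len(a) - 1)
    vb = sum((v - mb) ** 2 for v in b) / (len(b) - 1)
    return (ma - mb) / math.sqrt(va / len(a) + vb / len(b))


def select_top_k(X, y, k):
    """Score each feature by |t| and return the indices of the k largest."""
    rows0 = [row for row, label in zip(X, y) if label == 0]
    rows1 = [row for row, label in zip(X, y) if label == 1]
    scores = [
        abs(welch_t([r[j] for r in rows0], [r[j] for r in rows1]))
        for j in range(len(X[0]))
    ]
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]


# Hypothetical configuration: 60 samples in total, 20,000 features, 20 markers.
rng = random.Random(0)
X, y = make_data(n_per_class=30, n_features=20000, n_markers=20, shift=1.5, rng=rng)
selected = select_top_k(X, y, k=20)
recovered = sum(1 for j in selected if j < 20)
print(f"{recovered} of 20 marker features appear among the top 20 selected")
```

Varying `n_per_class` or `shift` in this sketch gives a rough feel for how the number of recovered markers depends on sample size, which is the kind of trend the study examines under its own models and methods.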