A granular computing approach to gene selection.

Gene selection is a key step in cancer classification with DNA microarrays. The high dimensionality and small sample size of microarray data sets remain a challenge. Many gene selection algorithms based on rough set theory have been presented, but most are time-consuming. In this paper, a new gene selection method based on granular computing is proposed. First, several granular computing-based concepts are introduced and some of their important properties are derived. The relationship between the positive region-based reduct and the granular space-based reduct is then discussed. Next, a feature significance measure is proposed to improve the efficiency and reduce the complexity of the classical algorithm. Using hash table and input sequence techniques, a fast heuristic algorithm is constructed to improve the computational efficiency of gene selection for cancer classification. Extensive experiments are conducted on five public gene expression data sets and seven data sets from the UCI repository. The experimental results confirm the efficiency and effectiveness of the proposed algorithm.
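To make the positive region idea concrete, the following is a minimal sketch of how a hash table can partition samples into equivalence classes in a single pass, and how the resulting positive region size can drive a greedy forward selection. It is not the algorithm proposed in the paper: the function names `positive_region_size` and `greedy_reduct`, the toy data, and the use of positive region size as the selection criterion are all assumptions made for illustration, and the paper's own significance measure and input sequence technique are not reproduced here.

```python
from collections import defaultdict


def positive_region_size(rows, labels, features):
    """Count samples whose equivalence class under `features` carries one label.

    Hypothetical helper: a dict keyed by the tuple of selected feature values
    groups the samples in one pass over the data.
    """
    blocks = defaultdict(lambda: [0, set()])  # key -> [sample count, labels seen]
    for row, label in zip(rows, labels):
        key = tuple(row[f] for f in features)
        block = blocks[key]
        block[0] += 1
        block[1].add(label)
    # A block belongs to the positive region only if it is label-consistent.
    return sum(count for count, seen in blocks.values() if len(seen) == 1)


def greedy_reduct(rows, labels):
    """Greedy forward selection (illustrative assumption, not the paper's method).

    Adds the feature that yields the largest positive region until the
    positive region of the full feature set is reached.
    """
    all_features = list(range(len(rows[0])))
    target = positive_region_size(rows, labels, all_features)
    selected = []
    remaining = list(all_features)
    current = positive_region_size(rows, labels, selected)
    while current < target and remaining:
        best = max(
            remaining,
            key=lambda f: positive_region_size(rows, labels, selected + [f]),
        )
        selected.append(best)
        remaining.remove(best)
        current = positive_region_size(rows, labels, selected)
    return selected


# Toy, made-up data: three discretised "genes" and a class label per sample.
rows = [(0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1), (0, 0, 0)]
labels = ["tumor", "tumor", "normal", "normal", "normal"]

print(greedy_reduct(rows, labels))  # [1]
```

In this toy example the second gene (index 1) alone separates the two classes, so the greedy loop stops after one step. Each call to `positive_region_size` makes a single pass over the samples, which is the kind of saving a hash-based partition offers over pairwise comparison of samples.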