A Ranking Approach for Probe Selection and Classification of Microarray Data with Artificial Neural Networks

Classification of acute leukemia into its myeloid and lymphoblastic subtypes is usually based on the morphology of the tumor. Nevertheless, the subtypes may have a similar histopathological appearance, which makes screening procedures difficult. In addition, approximately one-third of acute myeloid leukemias are characterized by aberrant cytoplasmic localization of nucleophosmin (NPMc(+)), and the majority of these cases have a normal karyotype. This work uses two publicly available DNA microarray datasets to differentiate leukemia subtypes. The datasets were split into training and test sets, and feature selection methods were applied. Artificial neural network classifiers were developed to compare the feature selection methods. For the first dataset, the 50 genes selected for the best classifier were sufficient to correctly classify all patients in the test set. For the second dataset, five genes yielded 97.5% accuracy in the test set.
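
As a rough illustration of what ranking-based probe selection can look like, the sketch below scores each probe on the training samples only and keeps the top-ranked ones. The abstract does not name the ranking statistic, so the signal-to-noise score used here is an assumption, as are the function name `rank_probes` and the toy expression values, which are made up for the example and are not taken from either dataset.

```python
from statistics import mean, stdev


def rank_probes(expression, labels, top_k):
    """Rank probes by absolute signal-to-noise ratio between two classes.

    expression: list of samples, each a list of probe intensities
    labels:     list of 0/1 class labels, one per sample
    Returns the indices of the top_k probes, best first.
    (Hypothetical helper; the scoring statistic is an assumption.)
    """
    n_probes = len(expression[0])
    scores = []
    for j in range(n_probes):
        # Split the j-th probe's values by class.
        a = [row[j] for row, y in zip(expression, labels) if y == 0]
        b = [row[j] for row, y in zip(expression, labels) if y == 1]
        spread = stdev(a) + stdev(b)
        # Large mean difference relative to within-class spread ranks high.
        score = abs(mean(a) - mean(b)) / spread if spread > 0 else 0.0
        scores.append(score)
    order = sorted(range(n_probes), key=lambda j: scores[j], reverse=True)
    return order[:top_k]


# Toy training set (made-up numbers): 6 samples x 4 probes.
train_x = [
    [1.0, 5.0, 3.0, 9.0],
    [1.2, 5.1, 2.0, 1.0],
    [0.9, 4.9, 4.0, 5.0],
    [3.0, 5.0, 3.5, 2.0],
    [3.1, 5.2, 2.5, 8.0],
    [2.9, 4.8, 3.0, 4.0],
]
train_y = [0, 0, 0, 1, 1, 1]

# Rank on the training split only, so the test split stays unseen.
selected = rank_probes(train_x, train_y, top_k=2)
print(selected)
```

In a pipeline of the kind summarized above, the columns in `selected` would then be the only inputs passed to the neural network classifier, and the same column indices would be applied unchanged to the test set.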