Predictive Analysis of Gene Expression Data from Human SAGE Libraries

We study the impact of dimensionality reduction methods on the performance of classification methods. Typically, genes that are not differentially expressed and genes that show little variation (especially those expressed at low levels) are considered unimportant for discriminating between classes, and filtering techniques remove them from further analysis. We are interested in their relevance for classification, a question that has been left unexplored. We compare results obtained using filtering techniques with those obtained using feature selection approaches. Based on experiments, we demonstrate that applying typical filtering approaches reduces the predictive accuracy of the induced classifiers.