A comparative study of the improvements that pre-filter methods bring to feature selection using microarray data

Feature selection techniques have become essential in biomarker discovery since the development of microarrays. However, the high-dimensional nature of microarray data makes feature selection time-consuming. To overcome this difficulty, filtering the data according to background knowledge before applying feature selection techniques has become a popular topic in microarray analysis. Different filter methods may affect the final result greatly, so it is important to evaluate them systematically. In this paper, we compare the performance of statistical-based filter methods, biological-based filter methods, and their combination on microRNA-mRNA parallel expression profiles, using L1 logistic regression as the feature selection technique. Four types of data were built for both the microRNA and the mRNA expression profiles. The results showed that filter-based feature selection achieved similar or better AUC and precision with fewer features. It should therefore be considered when researchers need fast results on computationally demanding problems in bioinformatics.
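As an illustration of the general pipeline (pre-filter first, then L1-penalized logistic regression on the reduced feature set), the following is a minimal sketch and not the implementation used in this study. The variance-based filter is only an assumed stand-in for a statistical pre-filter, the proximal-gradient solver is one simple way to fit an L1 logistic model, and all function names, the toy data, and the parameter values (`keep`, `lam`, `lr`, `iters`) are hypothetical.

```python
import math
import random


def make_toy_data(n=40, seed=0):
    """Hypothetical toy data: column 0 is informative, columns 1-4 are
    noise, columns 5-9 are nearly constant (low variance)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [(2.0 if label else -2.0) + rng.gauss(0, 0.5)]
        row += [rng.gauss(0, 1.0) for _ in range(4)]
        row += [rng.gauss(0, 0.01) for _ in range(5)]
        X.append(row)
        y.append(label)
    return X, y


def variance_filter(X, keep):
    """Assumed statistical pre-filter: indices of the `keep` columns
    with the highest variance, returned in ascending order."""
    n, p = len(X), len(X[0])
    variances = []
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        variances.append(sum((v - mean) ** 2 for v in col) / n)
    order = sorted(range(p), key=lambda j: variances[j], reverse=True)
    return sorted(order[:keep])


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def l1_logistic(X, y, lam=0.1, lr=0.1, iters=500):
    """Fit L1-penalized logistic regression by proximal gradient descent.
    The intercept b is not penalized."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [soft_threshold(wj - lr * gj / n, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def select_features(X, y, keep, lam=0.1):
    """Pre-filter, then keep the columns with non-zero L1 coefficients.
    Returned indices refer to the original (unfiltered) columns."""
    kept = variance_filter(X, keep)
    X_small = [[row[j] for j in kept] for row in X]
    w, _ = l1_logistic(X_small, y, lam=lam)
    return [kept[k] for k, wk in enumerate(w) if wk != 0.0]
```

Calling `select_features(X, y, keep=5)` on the toy data from `make_toy_data()` fits the model on five columns instead of ten, which is the source of the speed-up that pre-filtering aims for. A biological-based filter would replace `variance_filter` with a selection driven by prior knowledge, which this sketch does not attempt to model.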