Survey on Hybrid Approach for Feature Selection

In text document categorization, feature selection (FS) aims to make classifiers more efficient and accurate. However, when facing a new task, it is still difficult to quickly choose a suitable method from the many FS methods proposed in previous studies. As a preprocessing step for machine learning, feature selection has been very effective in reducing dimensionality, removing irrelevant data and noise, and improving the comprehensibility of results. Researchers have introduced many feature selection algorithms with different selection criteria, but no single criterion has been found to be best for all applications. We propose a hybrid approach to feature selection based on genetic algorithms (GAs) that uses a target learning algorithm to evaluate features, that is, a wrapper method. The advantages of this approach include the ability to accommodate multiple feature selection criteria and to find small subsets of features that perform well for the target algorithm. In this way, heterogeneous documents are summarized and presented in a uniform manner.
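
As a rough illustration of the GA-based wrapper idea described above, the following Python sketch evolves binary feature masks and scores each mask by the accuracy of a target learner trained only on the selected features. Everything in it is an assumption made for illustration and not the authors' implementation: the nearest-centroid learner, the synthetic data produced by the hypothetical helper `make_toy_data`, the subset-size penalty in `fitness`, and all GA settings (population size, tournament selection, one-point crossover, mutation rate).

```python
import random

def make_toy_data(n=200, n_features=8, seed=1):
    """Hypothetical data: features 0 and 1 are informative, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(3.0 * label, 1.0), rng.gauss(-3.0 * label, 1.0)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_features - 2)]
        X.append(row)
        y.append(label)
    half = n // 2
    return X[:half], y[:half], X[half:], y[half:]

def accuracy(mask, data):
    """Target learner (assumed): nearest-centroid accuracy on the masked features."""
    X_train, y_train, X_test, y_test = data
    idx = [i for i, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0
    sums, counts = {}, {}
    for row, label in zip(X_train, y_train):
        acc = sums.setdefault(label, [0.0] * len(idx))
        for k, i in enumerate(idx):
            acc[k] += row[i]
        counts[label] = counts.get(label, 0) + 1
    centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}
    correct = 0
    for row, label in zip(X_test, y_test):
        pred = min(centroids, key=lambda c: sum(
            (row[i] - m) ** 2 for i, m in zip(idx, centroids[c])))
        correct += pred == label
    return correct / len(y_test)

def fitness(mask, data, size_penalty=0.01):
    """Wrapper criterion: learner accuracy minus a small penalty per kept feature."""
    return accuracy(mask, data) - size_penalty * sum(mask)

def ga_select(data, n_features, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    cache = {}  # avoid retraining the learner for masks already evaluated

    def score(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = fitness(mask, data)
        return cache[key]

    def tournament(pop):
        a, b = rng.sample(pop, 2)
        return a if score(a) >= score(b) else b

    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=score)
    return best

data = make_toy_data()
best = ga_select(data, n_features=8)
print("selected mask:", best, "accuracy:", accuracy(best, data))
```

Because the target learner sits inside the fitness function, a different classifier or an additional selection criterion (for example, a filter score added to `fitness`) could be swapped in without changing the search loop; the size penalty is one simple way to favor small feature subsets.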