THE PROBLEM OF FINDING OPTIMAL SUBSET OF FEATURES

Machine learning and pattern recognition are confronted with the difficulty in selecting subset of features. That is,from a large set of candidate features, selecting a subset of features which are able to represent given examples (samples) consistently. Especially,the problem of finding an optimal subset of features has remained open. This paper, proves that the problem of finding an optimal subset of features is NP-hard, and presents a heuristic algorithm to solve this problem.