Finding the optimal feature selection sequence based on reinforcement learning

This paper proposes a method for generating an optimal feature selection sequence that is cost-effective for pattern classification. The sequence specifies the order in which features are selected for a process such as classification. We model the feature selection procedure as a Markov decision process (MDP) and use dynamic programming (DP) to learn a strategy that generates this order using only feedback from the environment. To simplify the problem, we design a simple test scenario in which three objects, whose synthetic feature values are generated randomly, are classified into three classes. The experimental results show that our method can reduce the computational time of feature extraction.
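
As a rough illustration of how a feature selection order can fall out of an MDP solved by DP, the following minimal sketch may help. It is not the method of this paper: the costs in `COSTS`, the probabilities in `P_RESOLVE`, and the function names are hypothetical, and the model is assumed to be known in advance rather than learned from feedback. A state is the set of features already extracted, an action extracts one more feature, and the episode ends as soon as a feature settles the classification.

```python
from functools import lru_cache

# Hypothetical extraction cost (e.g. milliseconds) of each feature.
COSTS = [4.0, 1.0, 2.0]
# Hypothetical probability that extracting a feature settles the classification.
P_RESOLVE = [0.8, 0.1, 0.5]
N = len(COSTS)
ALL_TRIED = (1 << N) - 1


@lru_cache(maxsize=None)
def expected_cost(state):
    """Return (minimal expected remaining cost, best next feature) for a state.

    `state` is a bitmask: bit i is set once feature i has been extracted
    without settling the classification.
    """
    if state == ALL_TRIED:
        return 0.0, None  # nothing left to extract
    best_cost, best_feature = float("inf"), None
    for i in range(N):
        if state & (1 << i):
            continue  # feature i was already extracted
        # Pay COSTS[i]; with probability 1 - P_RESOLVE[i] the episode continues.
        cost = COSTS[i] + (1.0 - P_RESOLVE[i]) * expected_cost(state | (1 << i))[0]
        if cost < best_cost:
            best_cost, best_feature = cost, i
    return best_cost, best_feature


def optimal_sequence():
    """Read off the feature order by following the best action in each state."""
    state, order = 0, []
    while state != ALL_TRIED:
        feature = expected_cost(state)[1]
        order.append(feature)
        state |= 1 << feature
    return order
```

Under these assumed numbers, `optimal_sequence()` returns `[2, 0, 1]` with an expected extraction cost of about 4.1, compared with about 4.56 for the fixed order 0, 1, 2. The saving comes from trying features with a low cost relative to their chance of settling the classification first.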