Classification with Costly Features using Deep Reinforcement Learning

We study a classification problem in which each feature can be acquired for a cost, and the goal is to optimize the trade-off between classification accuracy and the total feature cost. We frame the task as a sequential decision-making problem in which one sample is classified per episode: at each step, the agent uses the values of the features acquired so far to decide whether to purchase another feature or to classify the sample. We use vanilla Double Deep Q-learning, a standard reinforcement learning technique, to find a classification policy. We show that this generic approach outperforms Adapt-Gbrt, currently the best-performing algorithm developed specifically for classification with costly features.
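As a concrete illustration of this episodic formulation, below is a minimal environment sketch, assuming a per-feature cost vector and a trade-off coefficient `lambda_cost`; the class name, reward values, and state encoding are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

class CostlyFeaturesEnv:
    """One episode classifies one sample. Actions 0..n_features-1 purchase a
    feature (paying its cost); the remaining n_classes actions classify the
    sample and terminate the episode. (Illustrative sketch only.)"""

    def __init__(self, X, y, costs, n_classes, lambda_cost=1.0):
        self.X, self.y = X, y
        self.costs = np.asarray(costs)   # acquisition cost of each feature
        self.n_features = X.shape[1]
        self.n_classes = n_classes
        self.lambda_cost = lambda_cost   # accuracy-vs-cost trade-off weight

    def reset(self):
        self.i = np.random.randint(len(self.X))  # draw one sample per episode
        self.mask = np.zeros(self.n_features)    # no features revealed yet
        return self._state()

    def _state(self):
        # Unacquired features are zeroed out; the mask tells the agent
        # which zeros are real values and which are simply unknown.
        return np.concatenate([self.mask, self.X[self.i] * self.mask])

    def step(self, action):
        if action < self.n_features:             # purchase a feature
            reward = -self.lambda_cost * self.costs[action]
            self.mask[action] = 1.0
            return self._state(), reward, False
        pred = action - self.n_features          # classify, end the episode
        reward = 0.0 if pred == self.y[self.i] else -1.0
        return self._state(), reward, True
```

The classification policy is then learned with Double Deep Q-learning, whose target decouples action selection (online network) from action evaluation (target network). A sketch in PyTorch, assuming batched tensors and a float `done` flag:

```python
import torch

def double_dqn_target(q_net, target_net, r, s_next, done, gamma=0.99):
    """Double Q-learning target (van Hasselt et al.): the online network
    selects the greedy next action, the target network evaluates it."""
    with torch.no_grad():
        a_next = q_net(s_next).argmax(dim=1, keepdim=True)        # select
        q_next = target_net(s_next).gather(1, a_next).squeeze(1)  # evaluate
        return r + gamma * (1.0 - done) * q_next                  # bootstrap
```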
