Efficient Learning by Directed Acyclic Graph For Resource Constrained Prediction

We study the problem of reducing test-time acquisition costs in classification systems. Our goal is to learn decision rules that adaptively select sensors for each example as necessary to make a confident prediction. We model our system as a directed acyclic graph (DAG) where internal nodes correspond to sensor subsets, and decision functions at each node choose whether to acquire a new sensor or to classify using the available measurements. This problem can be posed as an empirical risk minimization over training data. Rather than jointly optimizing such a highly coupled and non-convex problem over all decision nodes, we propose an efficient algorithm motivated by dynamic programming. We learn node policies in the DAG by reducing the global objective to a series of cost-sensitive learning problems. Our approach is computationally efficient and has provable guarantees of convergence to the optimal system for a fixed architecture. In addition, we present an extension that maps other budgeted learning problems with a large number of sensors onto our DAG architecture, and we demonstrate empirical performance exceeding state-of-the-art algorithms on data composed of both few and many sensors.
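The test-time behavior described above can be illustrated with a minimal sketch (not the authors' code): each DAG state is the set of sensors acquired so far, and a per-state policy either stops and classifies with the measurements in hand or names the next sensor to acquire. The `policies` and `classifiers` callables, the sensor costs, and the budget below are all hypothetical stand-ins for whatever is learned during training.

```python
def predict_adaptive(x, acquire_cost, policies, classifiers, budget):
    """Traverse the sensor-acquisition DAG for one example.

    x            : dict mapping sensor index -> measurement
    acquire_cost : per-sensor acquisition cost, indexed by sensor
    policies     : dict mapping frozenset of acquired sensors -> callable
                   returning None (stop) or the next sensor index
    classifiers  : dict mapping frozenset of acquired sensors -> callable
                   returning a predicted label from available measurements
    budget       : maximum total acquisition cost allowed
    Returns (predicted_label, cost_spent).
    """
    acquired = frozenset()
    spent = 0.0
    while True:
        nxt = policies[acquired](x, acquired)
        # Stop if the policy says so, or acquiring would exceed the budget.
        if nxt is None or nxt in acquired or spent + acquire_cost[nxt] > budget:
            return classifiers[acquired](x, acquired), spent
        spent += acquire_cost[nxt]
        acquired = acquired | {nxt}


# Toy instance with two sensors: sensor 0 is cheap; sensor 1 is acquired
# only when sensor 0's reading is ambiguous (here: non-positive).
costs = [1.0, 2.0]
policies = {
    frozenset():       lambda x, a: 0,
    frozenset({0}):    lambda x, a: None if x[0] > 0 else 1,
    frozenset({0, 1}): lambda x, a: None,
}
classifiers = {
    frozenset():       lambda x, a: 0,
    frozenset({0}):    lambda x, a: int(x[0] > 0),
    frozenset({0, 1}): lambda x, a: int(x[0] + x[1] > 0),
}

# Easy example: confident after the cheap sensor alone.
label, cost = predict_adaptive({0: 0.5, 1: -1.0}, costs,
                               policies, classifiers, budget=5.0)
# Hard example: sensor 0 is ambiguous, so sensor 1 is also acquired.
label2, cost2 = predict_adaptive({0: -0.5, 1: 2.0}, costs,
                                 policies, classifiers, budget=5.0)
```

In the paper's formulation these node policies are not hand-written rules but are learned jointly via the dynamic-programming-style reduction to cost-sensitive learning; the sketch only shows the adaptive, per-example acquisition mechanism the DAG encodes.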
