Decision Trees for Function Evaluation: Simultaneous Optimization of Worst and Expected Cost

In several applications of automatic diagnosis and active learning, a central problem is the evaluation of a discrete function by adaptively querying the values of its variables until the values read uniquely determine the value of the function. In general, reading the value of a variable may incur a cost, and this cost should be taken into account when deciding which variable to read next. The goal is to design a strategy for evaluating the function that incurs little cost, either in the worst case or in expectation with respect to a prior distribution on the possible variable assignments. Our algorithm builds a strategy (decision tree) that attains a logarithmic approximation simultaneously for the expected and the worst-case cost. This is the best possible under the assumption that $\mathcal{P} \ne \mathcal{NP}$.
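To make the two cost measures concrete, the following minimal sketch computes the worst-case and expected cost of one very simple strategy on a toy instance. It is not the algorithm of the paper: the strategy here reads variables in a fixed order and stops as soon as the function value is determined, which is only a special case of a decision tree. The names `query_cost`, `worst_and_expected`, `table`, `costs`, and the example function (the AND of two bits with a uniform prior) are assumptions introduced for illustration.

```python
from typing import Dict, List, Tuple

Assignment = Tuple[int, ...]
# Hypothetical instance format: assignment -> (function value, prior probability).
Table = Dict[Assignment, Tuple[int, float]]


def query_cost(table: Table, costs: List[int], order: List[int],
               hidden: Assignment) -> int:
    """Cost of reading variables of `hidden` in `order` until f is determined."""
    alive = list(table)  # assignments consistent with the values read so far
    spent = 0
    for var in order:
        # Stop once all surviving assignments agree on the function value.
        if len({table[a][0] for a in alive}) == 1:
            break
        spent += costs[var]
        alive = [a for a in alive if a[var] == hidden[var]]
    return spent


def worst_and_expected(table: Table, costs: List[int],
                       order: List[int]) -> Tuple[int, float]:
    """Worst-case and prior-weighted expected cost of a fixed-order strategy."""
    per_assignment = {a: query_cost(table, costs, order, a) for a in table}
    worst = max(per_assignment.values())
    expected = sum(table[a][1] * per_assignment[a] for a in table)
    return worst, expected


# Toy instance (assumed): f(x0, x1) = x0 AND x1, uniform prior, costs 1 and 3.
table: Table = {(x0, x1): (x0 & x1, 0.25) for x0 in (0, 1) for x1 in (0, 1)}
costs = [1, 3]

cheap_first = worst_and_expected(table, costs, [0, 1])   # (4, 2.5)
costly_first = worst_and_expected(table, costs, [1, 0])  # (4, 3.5)
```

On this instance both orders have the same worst-case cost, but reading the cheaper variable first lowers the expected cost, which shows why the two objectives have to be tracked separately.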
