Approximation Algorithms for Optimal Decision Trees and Adaptive TSP Problems

We consider the problem of constructing optimal decision trees: given a collection of tests that can disambiguate between a set of m possible diseases, each test having a cost, and the a priori likelihood of any particular disease, what is a good adaptive strategy to perform these tests to minimize the expected cost to identify the disease? This problem has been studied in several works, with O(log m)-approximations known in the special cases when either costs or probabilities are uniform. In this paper, we settle the approximability of the general problem by giving a tight O(log m)-approximation algorithm. We also consider a substantial generalization, the adaptive traveling salesman problem. Given an underlying metric space, a random subset S of vertices is drawn from a known distribution, but S is initially unknown—we get information about whether any vertex is in S only when it is visited. What is a good adaptive strategy to visit all vertices in the random subset S while minimizing the expected dista...
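To make the decision-tree objective concrete, the following is a minimal sketch of how the expected cost of one adaptive testing strategy can be computed. It is not the algorithm of the paper: it uses a simple illustrative greedy rule (pick the test whose smaller outcome side carries the most probability mass per unit cost). It also assumes binary tests with strictly positive costs. The names `greedy_expected_cost`, `prior`, and `tests` are hypothetical and introduced only for this example.

```python
from typing import Dict, FrozenSet, List, Tuple


def greedy_expected_cost(
    prior: Dict[str, float],
    tests: List[Tuple[float, FrozenSet[str]]],
) -> float:
    """Expected cost of an illustrative greedy adaptive strategy.

    prior: maps each hypothesis (disease) to its a priori probability.
    tests: list of (cost, positive_set); a test is positive exactly on the
           hypotheses in positive_set. Costs are assumed strictly positive.
    """

    def solve(alive: FrozenSet[str]) -> float:
        # Returns the sum, over hypotheses still alive, of
        # prior[h] * (cost of tests performed from this point on).
        if len(alive) <= 1:
            return 0.0  # the hypothesis is identified, no further cost
        mass = sum(prior[h] for h in alive)
        best = None
        for cost, positive in tests:
            pos = alive & positive
            neg = alive - positive
            if not pos or not neg:
                continue  # this test does not split the remaining set
            p_pos = sum(prior[h] for h in pos)
            # Illustrative rule: balance of the split per unit cost.
            score = min(p_pos, mass - p_pos) / cost
            if best is None or score > best[0]:
                best = (score, cost, pos, neg)
        if best is None:
            raise ValueError("tests cannot separate the remaining hypotheses")
        _, cost, pos, neg = best
        # Every hypothesis still alive pays for the chosen test.
        return mass * cost + solve(pos) + solve(neg)

    return solve(frozenset(prior))


if __name__ == "__main__":
    prior = {"a": 0.5, "b": 0.25, "c": 0.25}
    tests = [
        (1.0, frozenset({"a"})),
        (1.0, frozenset({"b"})),
        (1.0, frozenset({"a", "b"})),
    ]
    print(greedy_expected_cost(prior, tests))
```

In this toy instance the greedy rule first separates `a` from the rest and then separates `b` from `c`, so `a` pays for one test and `b` and `c` pay for two each. A greedy rule of this kind carries no guarantee here; it only illustrates what "expected cost to identify the disease" measures.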
