We give a (ln(n) + 1)-approximation for the decision tree (DT) problem. We also show that DT does not have a PTAS unless P=NP. An instance of DT is a set of m binary tests T = (T1, ..., Tm) and a set of n items X = (X1, ..., Xn). The goal is to output a binary tree where each internal node is a test, each leaf is an item, and the average number of tests used to uniquely identify each item (or equivalently, the total external path length) is minimized. DT has a rich history in computer science with applications ranging from medical diagnosis to experiment design. Our work, while providing the first nontrivial upper and lower bounds on approximating DT, also demonstrates that DT and a subtly different problem which also bears the name decision tree have fundamentally different approximation complexity. In addition, we show a connection between ConDT and a third type of decision tree problem called MinDT, which allows us to show that no 2 δ(n)-approximation exists for MinDT, for δ < 1, unless NP is quasi-polynomial.
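To make the objective concrete, the following is a minimal sketch of how the cost of a candidate tree could be evaluated on a toy instance. The representation is an assumption made for illustration and is not taken from the paper: each test is modelled as the set of items on which it answers "true", a tree is a nested tuple (test, false_subtree, true_subtree), and the names leaf_depths and total_external_path_length are hypothetical. The sketch only scores a given tree; it does not implement the approximation algorithm.

```python
from typing import Dict, FrozenSet, Tuple, Union

# Assumed representation: a leaf is an item name, an internal node is
# (test_name, subtree_if_false, subtree_if_true).
Tree = Union[str, Tuple[str, "Tree", "Tree"]]


def leaf_depths(tree: Tree, tests: Dict[str, FrozenSet[str]],
                items: FrozenSet[str], depth: int = 0) -> Dict[str, int]:
    """Return {item: depth}, checking that every leaf is reached by exactly one item."""
    if isinstance(tree, str):
        # A leaf identifies an item only if that item alone reaches it.
        if items != frozenset([tree]):
            raise ValueError(f"leaf {tree!r} reached by {sorted(items)}")
        return {tree: depth}
    test, false_branch, true_branch = tree
    positive = items & tests[test]   # items answering "true" to this test
    negative = items - tests[test]   # items answering "false"
    depths = leaf_depths(false_branch, tests, negative, depth + 1)
    depths.update(leaf_depths(true_branch, tests, positive, depth + 1))
    return depths


def total_external_path_length(tree: Tree, tests: Dict[str, FrozenSet[str]],
                               items: FrozenSet[str]) -> int:
    """Sum of leaf depths; dividing by the number of items gives the average test count."""
    return sum(leaf_depths(tree, tests, items).values())


# Toy instance (hypothetical): n = 4 items, m = 3 binary tests.
items = frozenset("abcd")
tests = {
    "T1": frozenset("ab"),
    "T2": frozenset("ac"),
    "T3": frozenset("a"),
}

# Two valid trees over the same instance.
balanced = ("T1", ("T2", "d", "c"), ("T2", "b", "a"))
skewed = ("T3", ("T1", ("T2", "d", "c"), "b"), "a")

balanced_cost = total_external_path_length(balanced, tests, items)
skewed_cost = total_external_path_length(skewed, tests, items)
print(balanced_cost, balanced_cost / len(items))
print(skewed_cost, skewed_cost / len(items))
```

Both trees identify every item, but the balanced one has the smaller total external path length, so it is the better solution for this instance. Since the number of items is fixed, minimizing the total and minimizing the average number of tests are the same objective.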