On Learning Decision Trees with Large Output Domains

Abstract. For two disjoint sets of variables, $X$ and $Y$, and a class of functions $C$, we define $DT(X,Y,C)$ to be the class of all decision trees over $X$ whose leaves are functions from $C$ over $Y$. We study the learnability of $DT(X,Y,C)$ using membership and equivalence queries. Boolean decision trees, $DT(X,\emptyset,\{0,1\})$, were shown to be exactly learnable by Bshouty, but does this imply the learnability of decision trees that have non-Boolean leaves? A simple encoding of all possible leaf values will work provided that the size of $C$ is reasonable. Our investigation involves several cases where such a simple encoding is not feasible, i.e., when $|C|$ is large. We show how to learn decision trees whose leaves are learnable concepts belonging to a class $C$, that is, $DT(X,Y,C)$, when the separation between the variables $X$ and $Y$ is known. A simple algorithm for decision trees whose leaves are constants, $DT(X,\emptyset,C)$, is also presented. Each of the cases above requires at least $s$ separate executions of Bshouty's algorithm, where $s$ is the number of distinct leaves of the tree, but we show that if $C$ is a bounded lattice, $DT(X,\emptyset,C)$ is learnable using only one execution of this algorithm.
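
For a concrete illustration of the class defined above (a small example of ours; the particular choice of $C$ is only for illustration), take $X=\{x_1\}$, $Y=\{y_1,y_2\}$, and let $C$ be the class of conjunctions over $Y$. One member of $DT(X,Y,C)$ is the depth-one tree
% illustrative example only; assumes C = conjunctions over Y
\[
  T(x_1,y_1,y_2) \;=\;
  \begin{cases}
    y_1 \wedge y_2 & \text{if } x_1 = 1,\\
    y_2            & \text{if } x_1 = 0,
  \end{cases}
\]
whose single internal node queries the variable $x_1 \in X$ and whose two leaves, $y_1 \wedge y_2$ and $y_2$, are functions from $C$ over $Y$. Setting $Y=\emptyset$ and letting $C$ be a set of constants recovers the case $DT(X,\emptyset,C)$, and $C=\{0,1\}$ recovers ordinary Boolean decision trees.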