Decision tree methods constitute an important and widely used technique for classification problems. When such trees are used in a data mining and knowledge discovery context, ease of interpretation of the resulting trees is an important requirement. Decision trees with tests based on a single variable, as produced by methods such as ID3 and C4.5, often require a large number of tests to achieve acceptable accuracy. This undermines the interpretability of these trees, which is an important reason for using them in the first place. Recently, a number of methods for constructing decision trees with multivariate tests have been proposed. Multivariate decision trees are often smaller and more accurate than univariate trees; however, the use of linear combinations of the variables may result in trees that are hard to interpret. In this paper we consider trees with tests based on combinations of at most two variables. We show that bivariate decision trees are an interesting alternative to both univariate and multivariate trees, especially with respect to ease of interpretation.
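For illustration, the three kinds of tests can be sketched in generic notation (the coefficients $a_j$, $a$, $b$ and threshold $c$ below are placeholders, not the paper's): a univariate test compares a single variable to a threshold, an oblique (multivariate) test uses a linear combination of all $p$ variables, and a bivariate test restricts that combination to a single pair of variables,
\[
x_i \le c, \qquad \sum_{j=1}^{p} a_j x_j \le c, \qquad a\,x_i + b\,x_j \le c .
\]
The middle form can achieve compact trees but is hard to read, while the last form keeps each internal node restricted to two variables, which is what makes bivariate trees easier to interpret.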