We propose a classification algorithm that extends linear classifiers for binary classification problems by searching for possible later splits to handle remote clusters. These additional splits are sought along directions given by several eigen transformations. The resulting structure is a tree with unique properties that allow, during the construction of the classifier, the use of criteria more directly related to classification power than is possible with traditional classification trees. We show that the algorithm produces classifiers equivalent to linear classifiers where the latter are optimal, and that it otherwise offers greater flexibility while being more robust than traditional classification trees. We also show how the algorithm can outperform traditional classification algorithms on a real-life example. The new classifiers retain the interpretability of linear classifiers and traditional classification trees, a property unavailable with more complex classifiers. In addition, they make it easy to identify not only the main properties of the separate classes, but also properties of potential subclasses.
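The abstract's core idea can be illustrated with a minimal sketch: a root split given by a Fisher-style linear discriminant, followed by a later split searched along a leading eigen (principal-component) direction to isolate a remote cluster that the linear split alone cannot handle. All function names, the gap-based split criterion, and the toy data below are illustrative assumptions, not the authors' actual algorithm.

```python
import numpy as np

def fisher_direction(X, y):
    # Fisher linear discriminant direction for two classes:
    # w proportional to Sw^{-1} (m1 - m0), lightly regularized.
    m0, m1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    Sw = np.cov(X[y == 0], rowvar=False) + np.cov(X[y == 1], rowvar=False)
    w = np.linalg.solve(Sw + 1e-6 * np.eye(X.shape[1]), m1 - m0)
    return w / np.linalg.norm(w)

def linear_node(X, y):
    # Root split: project on the Fisher direction, threshold at the
    # midpoint of the projected class means.
    w = fisher_direction(X, y)
    t = ((X[y == 0] @ w).mean() + (X[y == 1] @ w).mean()) / 2
    return w, t

def eigen_split(X):
    # Candidate later split: project the node's data on its leading
    # eigenvector and place a threshold in the widest gap, which
    # separates a remote cluster if one exists along that direction.
    mu = X.mean(axis=0)
    _, vecs = np.linalg.eigh(np.cov(X - mu, rowvar=False))
    v = vecs[:, -1]                       # leading eigen direction
    z = np.sort((X - mu) @ v)
    i = int(np.argmax(np.diff(z)))        # widest gap along v
    return v, (z[i] + z[i + 1]) / 2 + mu @ v

# Toy data: class 1 has a remote cluster that the root linear split
# cannot isolate, motivating the additional eigen-direction split.
rng = np.random.default_rng(0)
X0 = rng.normal([0.0, 0.0], 0.3, (50, 2))
X1 = np.vstack([rng.normal([2.0, 0.0], 0.3, (40, 2)),
                rng.normal([-4.0, 0.0], 0.3, (10, 2))])  # remote cluster
X = np.vstack([X0, X1])
y = np.r_[np.zeros(50), np.ones(50)].astype(int)

w, t = linear_node(X, y)
root_pred = (X @ w > t).astype(int)       # misses the remote cluster
v, s = eigen_split(X1)                    # later split inside class 1
```

The later split along `v` at threshold `s` separates the ten remote class-1 points from the main class-1 cluster, which is the kind of refinement the tree structure described above is meant to provide.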