A framework for bottom-up induction of oblique decision trees

Decision-tree induction algorithms are widely used in knowledge discovery and data mining, especially in scenarios where model comprehensibility is desired. A variation of the traditional univariate approach is the so-called oblique decision tree, which allows multivariate tests in its non-terminal nodes. Oblique decision trees can model decision boundaries that are oblique to the attribute axes, whereas univariate trees can only perform axis-parallel splits. The vast majority of oblique and univariate decision-tree induction algorithms employ a top-down strategy for growing the tree, relying on an impurity-based measure for splitting nodes. In this paper, we propose BUTIF, a novel Bottom-Up Oblique Decision-Tree Induction Framework. BUTIF does not rely on an impurity measure for dividing nodes, since the data resulting from each split are known a priori. For generating the initial leaves of the tree and the splitting hyperplanes in its internal nodes, BUTIF allows the adoption of distinct clustering algorithms and binary classifiers, respectively. It is also capable of performing embedded feature selection, which may reduce the number of features in each hyperplane and thus improve model comprehensibility. Unlike virtually every top-down decision-tree induction algorithm, BUTIF does not require a subsequent pruning procedure to avoid overfitting, because its bottom-up nature does not overgrow the tree. We compare distinct instances of BUTIF to traditional univariate and oblique decision-tree induction algorithms. Empirical results show the effectiveness of the proposed framework.
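The bottom-up procedure outlined above — obtain initial leaves by clustering, then repeatedly merge the two closest groups, fitting a separating hyperplane at each new internal node — can be sketched as follows. This is a minimal illustration, not BUTIF itself: the pre-clustered inputs, the centroid-distance merging criterion, and the centroid-difference hyperplane are simple stand-ins for the framework's pluggable clustering algorithms and binary classifiers.

```python
import numpy as np

class Node:
    """A tree node: leaves carry a class label, internal nodes a hyperplane w.x + b = 0."""
    def __init__(self, w=None, b=None, left=None, right=None, label=None):
        self.w, self.b, self.left, self.right, self.label = w, b, left, right, label

def build_bottom_up(clusters):
    """clusters: list of (points, label) pairs, one per initial leaf.
    Repeatedly merges the two clusters with the closest centroids until one tree remains."""
    # each working entry: (centroid, subtree root, member points)
    nodes = [(pts.mean(axis=0), Node(label=lab), pts) for pts, lab in clusters]
    while len(nodes) > 1:
        # find the pair of clusters with the closest centroids
        best = None
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                d = np.linalg.norm(nodes[i][0] - nodes[j][0])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        ci, ni, pi = nodes[i]
        cj, nj, pj = nodes[j]
        # hyperplane separating the two groups: here the centroid-difference
        # direction through the midpoint, a stand-in for a trained binary classifier
        w = ci - cj
        b = -w.dot((ci + cj) / 2.0)
        parent = Node(w=w, b=b, left=ni, right=nj)
        merged = np.vstack([pi, pj])
        nodes = [nodes[k] for k in range(len(nodes)) if k not in (i, j)]
        nodes.append((merged.mean(axis=0), parent, merged))
    return nodes[0][1]

def predict(node, x):
    """Route x down the tree by the sign of each oblique test."""
    while node.label is None:
        node = node.left if node.w.dot(x) + node.b >= 0 else node.right
    return node.label
```

Note that, as the abstract states, no impurity measure is needed: each internal node is created knowing exactly which instances fall on each side, and since the tree never grows beyond the merge hierarchy, no pruning pass is required.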
