Inducing Readable Oblique Decision Trees

Although machine learning models are found in more and more practical applications, stakeholders can be wary of models that are not hard-coded and fully specified. To foster trust, it is crucial to provide models whose predictions are explainable. Decision Trees can be understood by humans if they are simple enough, but their accuracy suffers compared to other common machine learning methods. Oblique Decision Trees can provide better accuracy and smaller trees, but their decision rules are more complex. This article presents MUST (Multivariate Understandable Statistical Tree), an Oblique Decision Tree split algorithm based on Linear Discriminant Analysis that aims to preserve explainability by limiting the number of variables that appear in each decision rule.
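The abstract does not spell out MUST's split procedure, but the core idea of an LDA-based oblique split restricted to a handful of variables can be sketched. Everything in the sketch below is an illustrative assumption rather than the authors' exact method: the standardized-mean-gap feature-selection heuristic, the Gini-based threshold search, and the two-class setting are all placeholders for whatever MUST actually uses.

```python
import numpy as np

def fisher_direction(X, y):
    # Two-class Fisher/LDA direction: w = Sw^{-1} (mu1 - mu0),
    # where Sw is the pooled within-class scatter matrix.
    X0, X1 = X[y == 0], X[y == 1]
    Sw = (np.cov(X0, rowvar=False) * (len(X0) - 1)
          + np.cov(X1, rowvar=False) * (len(X1) - 1))
    Sw += 1e-6 * np.eye(X.shape[1])  # ridge term for small-sample stability
    return np.linalg.solve(Sw, X1.mean(axis=0) - X0.mean(axis=0))

def gini(y):
    # Gini impurity of a label vector (labels assumed to be in {0, 1}).
    p = np.bincount(y, minlength=2) / len(y)
    return 1.0 - np.sum(p ** 2)

def oblique_split(X, y, k=2):
    # Keep only the k features with the largest standardized mean gap
    # between the two classes (an assumed selection heuristic), so that
    # at most k variables appear in the resulting decision rule.
    gap = np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))
    gap /= X.std(axis=0) + 1e-12
    feats = np.argsort(gap)[-k:]

    # Project samples onto the Fisher direction of those k features only.
    w = fisher_direction(X[:, feats], y)
    z = X[:, feats] @ w

    # Exhaustive threshold search minimizing weighted Gini impurity.
    best_imp, best_t = np.inf, None
    for t in np.unique(z)[:-1]:
        left = z <= t
        imp = left.mean() * gini(y[left]) + (1 - left.mean()) * gini(y[~left])
        if imp < best_imp:
            best_imp, best_t = imp, t
    # The rule reads: w[0]*x[feats[0]] + ... + w[k-1]*x[feats[k-1]] <= best_t
    return feats, w, best_t

if __name__ == "__main__":
    # Toy usage on two Gaussian classes separated along all five features.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (50, 5)),
                   rng.normal(1.0, 1.0, (50, 5))])
    y = np.array([0] * 50 + [1] * 50)
    feats, w, t = oblique_split(X, y, k=2)
    print("features:", feats, "weights:", w, "threshold:", t)
```

Capping the discriminant at k features is what keeps the rule readable: a human only has to weigh k terms. The ranking heuristic used here could equally be replaced by a sparse discriminant or an exhaustive subset search without changing the shape of the resulting split.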
