ReG-Rules: An Explainable Rule-Based Ensemble Learner for Classification

Learning classification models to predict the class labels of new, previously unseen data instances is one of the most essential tasks in data mining. A popular approach to classification is ensemble learning, in which several diverse and independent classification models are combined to predict class labels. Ensemble models are important because they tend to improve the average classification accuracy over any individual member of the ensemble. However, classification models are often also required to be explainable, in order to reduce the risk of irreversible wrong classifications. Explainability of classification models is needed in many critical applications, such as stock market analysis, credit risk evaluation and intrusion detection. Unfortunately, ensemble learning reduces the explainability of the classification, as the analyst would have to examine many decision models to gain insight into the reasoning behind a prediction. The aim of the research presented in this paper is to create an ensemble method that is explainable in the sense that it presents the human analyst with a conditioned view of the most relevant model aspects involved in a prediction. To achieve this aim, the authors developed a rule-based explainable ensemble classifier termed Ranked ensemble G-Rules (ReG-Rules), which gives the analyst an extract of the most relevant classification rules for each individual prediction. ReG-Rules was evaluated in terms of its theoretical computational complexity, empirically on benchmark datasets, and qualitatively with respect to the complexity and readability of the induced rule sets. The results show that ReG-Rules scales linearly, delivers high accuracy and at the same time produces a compact, manageable set of rules describing the predictions made.
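The core idea described above — an ensemble of rule-based base learners that votes on a class label and reports the highest-ranked firing rules as the explanation — can be sketched as follows. This is a minimal, hypothetical illustration in the spirit of ReG-Rules, not the authors' actual algorithm: the class names, the use of a per-rule quality score as both vote weight and ranking criterion, and the `top_k` cut-off are all illustrative assumptions.

```python
from collections import Counter

class Rule:
    """An IF-THEN classification rule with a quality score."""
    def __init__(self, condition, label, quality):
        self.condition = condition  # callable: instance dict -> bool
        self.label = label          # class predicted when the rule fires
        self.quality = quality      # e.g. rule confidence on training data

    def fires(self, x):
        return self.condition(x)

class RankedRuleEnsemble:
    """Combines the rule sets of several base learners (hypothetical sketch)."""
    def __init__(self, rule_sets):
        self.rule_sets = rule_sets  # one induced rule set per base learner

    def predict_explain(self, x, top_k=3):
        # Collect every rule (from any base learner) that covers instance x.
        fired = [r for rules in self.rule_sets for r in rules if r.fires(x)]
        if not fired:
            return None, []  # abstain: no rule covers the instance
        # Quality-weighted majority vote over the firing rules.
        votes = Counter()
        for r in fired:
            votes[r.label] += r.quality
        label = votes.most_common(1)[0][0]
        # Explanation: the top-k firing rules that support the winning label.
        support = sorted((r for r in fired if r.label == label),
                         key=lambda r: r.quality, reverse=True)
        return label, support[:top_k]

# Toy usage: two base learners with simple one-attribute rules.
rules_a = [Rule(lambda x: x["age"] > 40, "high_risk", 0.9),
           Rule(lambda x: x["age"] <= 40, "low_risk", 0.7)]
rules_b = [Rule(lambda x: x["income"] < 20, "high_risk", 0.8)]
ensemble = RankedRuleEnsemble([rules_a, rules_b])
label, explanation = ensemble.predict_explain({"age": 55, "income": 15})
print(label, len(explanation))  # high_risk 2
```

The analyst thus sees only the few rules that actually drove the prediction, rather than having to inspect every model in the ensemble.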
