Induction of classification rules by Gini-index based rule generation

Abstract: Rule learning is one of the most popular areas of machine learning research, because the outcome of learning is a set of rules, which not only provides accurate predictions but also makes the mapping from inputs to outputs transparent. In general, rule learning approaches can be divided into two main types: ‘divide and conquer’ and ‘separate and conquer’. The former, also known as Top-Down Induction of Decision Trees, learns a set of rules represented in the form of a decision tree. This approach tends to produce a large number of complex rules (usually due to the replicated sub-tree problem), which lowers computational efficiency in both the training and testing stages and leads to overfitting of the training data. This problem has motivated researchers to develop ‘separate and conquer’ rule learning approaches, also known as covering approaches, which learn a set of rules sequentially: a rule is learned, the instances covered by that rule are deleted from the training set, and the next rule is learned from the smaller training set that remains. In this paper, we propose a new algorithm, GIBRG, which employs the Gini index to measure the quality of each rule being learned, in the context of ‘separate and conquer’ rule learning. Our experiments show that the proposed algorithm outperforms both decision tree learning algorithms (C4.5 and CART) and a ‘separate and conquer’ approach (Prism). In addition, it produces fewer rules and rule terms, and is therefore more computationally efficient and less prone to overfitting.
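
To make the two mechanisms described in the abstract concrete, the sketch below is a minimal, illustrative Python implementation of Gini-index-driven ‘separate and conquer’ rule induction. It is not the authors’ GIBRG code: the function names, the greedy one-term-at-a-time rule specialisation, and the purity-based stopping criterion are assumptions based only on the abstract’s description. The Gini index of a covered subset S is taken as Gini(S) = 1 − Σ_k p_k², where p_k is the proportion of class k in S; a rule is grown by repeatedly appending the attribute-value term whose covered subset has the lowest Gini impurity, and the covering loop then deletes the covered instances and repeats.

```python
from collections import Counter

def gini_index(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k^2).
    Lower is better; 0 means every covered instance has the same class."""
    n = len(labels)
    if n == 0:
        return 1.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def learn_rule(instances, labels, attributes):
    """Grow one rule greedily: at each step, append the attribute-value
    term whose covered subset has the lowest Gini impurity.
    Stop when the covered subset is pure or no attributes remain."""
    rule = []
    covered = list(range(len(instances)))  # indices still covered by the rule
    remaining = set(attributes)
    while remaining and gini_index([labels[i] for i in covered]) > 0.0:
        best = None  # (gini, attribute, value, covered subset)
        for a in remaining:
            for v in {instances[i][a] for i in covered}:
                subset = [i for i in covered if instances[i][a] == v]
                g = gini_index([labels[i] for i in subset])
                if best is None or g < best[0]:
                    best = (g, a, v, subset)
        _, a, v, covered = best
        rule.append((a, v))     # add the term 'a = v' to the rule
        remaining.discard(a)    # each attribute is used at most once
    majority = Counter(labels[i] for i in covered).most_common(1)[0][0]
    return rule, majority, covered

def separate_and_conquer(instances, labels, attributes):
    """Covering loop: learn a rule, delete the instances it covers,
    and repeat on the smaller training set until it is empty."""
    rules = []
    idx = list(range(len(instances)))  # indices of uncovered instances
    while idx:
        sub_x = [instances[i] for i in idx]
        sub_y = [labels[i] for i in idx]
        rule, cls, covered = learn_rule(sub_x, sub_y, attributes)
        rules.append((rule, cls))
        covered_set = set(covered)  # positions within sub_x
        idx = [i for j, i in enumerate(idx) if j not in covered_set]
    return rules
```

Each returned rule is a list of (attribute, value) terms with a majority-class consequent. Note that deleting covered instances after each rule is learned is exactly what distinguishes this covering scheme from decision-tree (‘divide and conquer’) induction, and is why it avoids the replicated sub-tree problem mentioned above.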
