ForEx++: A New Framework for Knowledge Discovery from Decision Forests

Decision trees are widely used in real-world problems, both for prediction and for the discovery of classification (logic) rules. A decision forest is an ensemble of decision trees, typically built to achieve better predictive performance than a single decision tree. Beyond improving predictive performance, a decision forest can also be viewed as a pool of logic rules with great potential for knowledge discovery. However, a standard-sized decision forest usually generates so many rules that a user may not be able to manage them for effective knowledge analysis. In this paper, we propose a new, data-set-independent framework for extracting those rules that are comparatively more accurate, generalized, and concise than the others. We apply the proposed framework to rules generated by two different decision forest algorithms from publicly available medical data sets on dementia and heart disease. We then compare the quality of the rules extracted by the proposed framework with rules generated from a single J48 decision tree and rules extracted by another recent method. The results reported in this paper demonstrate the effectiveness of the proposed framework.
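To make the idea concrete, below is a minimal sketch of what a data-set-independent rule filter of this kind could look like. It scores each rule on the three qualities the abstract names (accuracy, generalization via coverage, and conciseness via rule length) and keeps rules that beat the pool-wide averages, so no hand-tuned, data-set-specific threshold is needed. The `Rule` fields, the `select_rules` helper, and the mean-based cutoffs are illustrative assumptions, not the paper's exact criteria.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Rule:
    conditions: list[str]  # antecedent conditions, e.g. "age > 65"
    accuracy: float        # fraction of covered records classified correctly
    coverage: float        # fraction of training records the rule covers

def select_rules(rules: list[Rule]) -> list[Rule]:
    """Keep rules that are above-average in accuracy and coverage
    and no longer than the average rule in the forest's rule pool."""
    avg_acc = mean(r.accuracy for r in rules)
    avg_cov = mean(r.coverage for r in rules)
    avg_len = mean(len(r.conditions) for r in rules)
    return [
        r for r in rules
        if r.accuracy >= avg_acc              # comparatively more accurate
        and r.coverage >= avg_cov             # comparatively more generalized
        and len(r.conditions) <= avg_len      # comparatively more concise
    ]
```

Because the cutoffs are computed from the rule pool itself rather than fixed in advance, the same filter can be applied unchanged to rules harvested from any forest on any data set, which is the sense in which such a framework is data-set independent.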
