Redefinition of Decision Rules Based on the Importance of Elementary Conditions Evaluation

The paper presents an algorithm for redefining decision rules based on an evaluation of the importance of the elementary conditions occurring in induced rules. Standard and simplified heuristic indices for evaluating elementary condition importance are described. The two indices are compared in terms of the quality of the resulting classifiers and the elementary condition rankings they produce. The efficiency of the proposed algorithm has been verified on 21 benchmark data sets. Moreover, an analysis of practical applications of the proposed methods to biomedical and medical data is presented. The results show that the redefinition considerably reduces the number of rules needed to describe each decision class. Additionally, the redefined rules may contain negated elementary conditions.
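To make the notion of elementary condition importance more concrete, the following is a minimal sketch under stated assumptions. It does not reproduce the standard or simplified indices from the paper, which the abstract does not define. It assumes a rule is a conjunction of `(attribute, value)` equality conditions, takes rule precision as the quality measure, and scores each condition by how much the rule's quality drops when that condition is removed. The names `covers`, `precision`, and `condition_importance`, as well as the toy data, are hypothetical.

```python
from typing import Dict, List, Tuple

Condition = Tuple[str, object]   # (attribute, required value); an assumed rule format
Example = Dict[str, object]      # attribute -> value, plus a "label" key


def covers(conditions: List[Condition], example: Example) -> bool:
    """True if the example satisfies every elementary condition."""
    return all(example[attr] == value for attr, value in conditions)


def precision(conditions: List[Condition], target: str, data: List[Example]) -> float:
    """Fraction of covered examples that belong to the target class."""
    covered = [e for e in data if covers(conditions, e)]
    if not covered:
        return 0.0
    return sum(e["label"] == target for e in covered) / len(covered)


def condition_importance(
    conditions: List[Condition], target: str, data: List[Example]
) -> Dict[Condition, float]:
    """Leave-one-out importance: quality of the full rule minus quality
    of the rule with one condition removed (an illustrative stand-in,
    not the index used in the paper)."""
    full = precision(conditions, target, data)
    scores = {}
    for i, cond in enumerate(conditions):
        reduced = conditions[:i] + conditions[i + 1:]
        scores[cond] = full - precision(reduced, target, data)
    return scores


# Toy data: the label depends only on attribute "b".
data = [
    {"a": 1, "b": 1, "label": "pos"},
    {"a": 1, "b": 0, "label": "neg"},
    {"a": 0, "b": 1, "label": "pos"},
    {"a": 0, "b": 0, "label": "neg"},
]
rule = [("a", 1), ("b", 1)]      # IF a = 1 AND b = 1 THEN pos
scores = condition_importance(rule, "pos", data)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy setting the condition on `b` receives a higher score than the condition on `a`, so a ranking of this kind could suggest which conditions to keep when rules are rebuilt. How the paper turns such rankings into new rules, including negated conditions, is not specified in the abstract.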
