Introducing a Rule Importance Measure

Association rule algorithms often generate an excessive number of rules, many of which are not significant, and it is difficult to determine which rules are the most useful, interesting, and important. We introduce a rough-set-based Rule Importance Measure for selecting the most important rules. We first use the ROSETTA software to generate multiple reducts for a data set. The Apriori association rule algorithm is then applied to generate one rule set per reduct. Some rules appear in more of these rule sets than others, and we consider such rules more important. We define rule importance as the frequency with which an association rule is generated across all the rule sets. Rule importance differs from both rule interestingness measures and rule quality measures in the tasks it is applied to, the stage of the process at which it is applied, and what it measures. Experimental results on an artificial data set, UCI machine learning data sets, and a real geriatric care medical data set show that our method reduces the computational cost of rule generation and provides an effective measure of how important a rule is.
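The definition of rule importance can be illustrated with a minimal sketch. It assumes the rule sets have already been produced (one per reduct) and that each rule is represented as a pair of frozen sets (antecedent, consequent). The names `rule_importance` and `make_rule` and the toy rules are hypothetical, and dividing the count by the number of rule sets is an assumed normalization that the abstract does not specify.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, Tuple

# A rule is modelled as (antecedent items, consequent items); this
# representation is an assumption made for illustration only.
Rule = Tuple[FrozenSet[str], FrozenSet[str]]


def make_rule(antecedent: Iterable[str], consequent: Iterable[str]) -> Rule:
    """Build a hashable rule from two collections of items."""
    return (frozenset(antecedent), frozenset(consequent))


def rule_importance(rule_sets: Iterable[Iterable[Rule]]) -> Dict[Rule, float]:
    """Return, for each rule, the fraction of rule sets that contain it.

    Each element of `rule_sets` is the collection of rules generated from
    one reduct. Normalizing by the number of rule sets is an assumption.
    """
    # Deduplicate within each rule set so a rule counts once per reduct.
    unique_sets = [set(rules) for rules in rule_sets]
    if not unique_sets:
        return {}
    counts = Counter(rule for rules in unique_sets for rule in rules)
    total = len(unique_sets)
    return {rule: count / total for rule, count in counts.items()}


# Toy example: three rule sets, as if generated from three reducts.
r1 = make_rule({"a=1"}, {"d=yes"})
r2 = make_rule({"b=0", "c=1"}, {"d=yes"})
r3 = make_rule({"c=1"}, {"d=no"})
toy_rule_sets = [[r1, r2], [r1, r3], [r1, r2]]

scores = rule_importance(toy_rule_sets)
# Rank rules from most to least important.
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy input, `r1` occurs in every rule set and therefore ranks first, while `r3` occurs in only one and ranks last. Generating the reducts and the per-reduct rule sets themselves is outside the scope of this sketch.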
