Knowledge-Based Interactive Postmining of Association Rules Using Ontologies

In data mining, the usefulness of association rules is strongly limited by the huge number of rules delivered. To overcome this drawback, several methods have been proposed in the literature, such as concise itemset representations, redundancy reduction, and postprocessing. However, because they are generally based on statistical information, most of these methods do not guarantee that the extracted rules are interesting to the user. Thus, it is crucial to help the decision-maker with an efficient postprocessing step that reduces the number of rules. This paper proposes a new interactive approach to prune and filter discovered rules. First, we propose to use ontologies in order to improve the integration of user knowledge in the postprocessing task. Second, we propose the Rule Schema formalism, which extends the specification language proposed by Liu et al. for user expectations. Furthermore, an interactive framework is designed to assist the user throughout the analysis task. Applying our new approach to voluminous sets of rules, we were able, by integrating domain expert knowledge in the postprocessing step, to reduce the number of rules to several dozen or fewer. Moreover, the quality of the filtered rules was validated by the domain expert at various points in the interactive process.
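
To make the idea of ontology-guided pruning and filtering concrete, the following is a minimal sketch in Python. It is not the Rule Schema formalism defined in the paper: the toy ontology, the names `ONTOLOGY`, `leaf_items`, `conforms`, and `filter_rules`, and the matching criterion (every schema concept must be instantiated by at least one item on the corresponding side of the rule) are all assumptions chosen for illustration.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical toy ontology: each concept maps to its direct sub-concepts.
# Names that do not appear as keys are treated as leaf items.
ONTOLOGY: Dict[str, List[str]] = {
    "Food": ["Fruit", "Dairy"],
    "Fruit": ["apple", "pear"],
    "Dairy": ["milk", "cheese"],
}

Rule = Tuple[FrozenSet[str], FrozenSet[str]]   # (antecedent, consequent)
Schema = Tuple[List[str], List[str]]           # concepts expected on each side


def leaf_items(concept: str) -> Set[str]:
    """Return all leaf items covered by a concept (the name itself if it is a leaf)."""
    children = ONTOLOGY.get(concept)
    if children is None:
        return {concept}
    items: Set[str] = set()
    for child in children:
        items |= leaf_items(child)
    return items


def side_matches(items: FrozenSet[str], concepts: List[str]) -> bool:
    """True if each concept is instantiated by at least one item on this side."""
    return all(items & leaf_items(c) for c in concepts)


def conforms(rule: Rule, schema: Schema) -> bool:
    """Assumed criterion: both sides of the rule instantiate the schema's concepts."""
    antecedent, consequent = rule
    return side_matches(antecedent, schema[0]) and side_matches(consequent, schema[1])


def filter_rules(rules: List[Rule], schema: Schema, keep_conforming: bool = True) -> List[Rule]:
    """Keep the rules that conform to the schema, or prune them if keep_conforming is False."""
    return [r for r in rules if conforms(r, schema) == keep_conforming]


# Illustrative data only; these rules are made up for the example.
EXAMPLE_RULES: List[Rule] = [
    (frozenset({"apple"}), frozenset({"milk"})),
    (frozenset({"milk"}), frozenset({"bread"})),
    (frozenset({"pear", "bread"}), frozenset({"cheese"})),
]
EXAMPLE_SCHEMA: Schema = (["Fruit"], ["Dairy"])

kept = filter_rules(EXAMPLE_RULES, EXAMPLE_SCHEMA)
pruned = filter_rules(EXAMPLE_RULES, EXAMPLE_SCHEMA, keep_conforming=False)
```

In this sketch the user states an expectation at the concept level ("some fruit implies some dairy product") and the ontology expands it to the concrete items, so a single schema can select or discard many item-level rules at once. An interactive loop would repeat this step with schemas revised by the expert.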
