Using ontologies to facilitate post-processing of association rules by domain experts

Data mining is used to discover hidden patterns or structures in large databases. Association rule induction extracts frequently occurring patterns in the form of association rules. A drawback of this technique is that it typically generates a very large number of rules. Several methods have been proposed to prune the set of extracted rules so that only those of interest to domain experts are presented. Some of these methods rely on subjective analysis based on prior domain knowledge, while others rely on objective, data-driven analysis based on numerical measures that only partially describe the interestingness of the extracted rules. Recently it has been proposed that ontologies could be used to guide the data mining process. In this paper, we propose a hybrid pruning method that combines objective analysis with subjective analysis, where the subjective analysis makes use of an ontology. We demonstrate the applicability of this hybrid method using a medical database.
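
The idea of combining an objective filter with an ontology-based subjective filter can be illustrated with a minimal Python sketch. It is not the method of the paper, whose details the abstract does not give. The choice of lift as the objective measure, the 1.2 threshold, the toy `IS_A` ontology, the sample rules, and the function names `subsumed_by` and `hybrid_prune` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    antecedent: frozenset
    consequent: frozenset
    confidence: float          # P(consequent | antecedent)
    consequent_support: float  # P(consequent)

    @property
    def lift(self) -> float:
        # Lift above 1 means the antecedent raises the chance of the consequent.
        return self.confidence / self.consequent_support


# Hypothetical is-a ontology (child -> parent); not taken from the paper.
IS_A = {
    "aspirin": "analgesic",
    "ibuprofen": "analgesic",
    "analgesic": "drug",
    "headache": "symptom",
    "fever": "symptom",
}


def subsumed_by(item: str, concept: str) -> bool:
    """True if item equals concept or has it as an is-a ancestor."""
    while item != concept and item in IS_A:
        item = IS_A[item]
    return item == concept


def hybrid_prune(rules, concepts_of_interest, min_lift=1.2):
    """Keep rules that pass both the objective and the subjective filter."""
    kept = []
    for rule in rules:
        # Objective step (assumed measure): drop rules with low lift.
        if rule.lift < min_lift:
            continue
        # Subjective step (assumed criterion): the consequent must mention
        # something that falls under a concept the expert cares about.
        if any(subsumed_by(item, concept)
               for item in rule.consequent
               for concept in concepts_of_interest):
            kept.append(rule)
    return kept


# Made-up rules for illustration only.
RULES = [
    Rule(frozenset({"aspirin"}), frozenset({"headache"}), 0.6, 0.3),
    Rule(frozenset({"ibuprofen"}), frozenset({"fever"}), 0.2, 0.25),
    Rule(frozenset({"bread"}), frozenset({"milk"}), 0.9, 0.5),
]

kept = hybrid_prune(RULES, {"symptom"})
```

In this sketch the second rule is dropped by the objective step because its lift is below the threshold, and the third is dropped by the subjective step because its consequent is not covered by any concept of interest in the toy ontology.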
