Exceptional Knowledge Discovery in Databases Based on Information Theory

This paper presents an algorithm for discovering exceptional knowledge from databases. Exceptional knowledge, defined as an exception to a general fact, is unexpected and is sometimes extremely useful in spite of its obscurity. Previous discovery approaches for this type of knowledge rely on either background knowledge or domain-specific criteria to evaluate the possible usefulness, i.e., the interestingness, of the knowledge extracted from a database. It has been pointed out, however, that these approaches are prone to overlooking useful knowledge. To circumvent these difficulties, we propose an information-theoretic approach in which exceptional knowledge is obtained together with its associated general knowledge, in the form of a rule pair, using a depth-first search method. The product of the ACEs (Average Compressed Entropies) of the rule pair is introduced as the criterion for evaluating the interestingness of exceptional knowledge. The inefficiency of depth-first search is alleviated by a branch-and-bound method that exploits an upper bound on the product of the ACEs. MEPRO, a knowledge discovery system based on our approach, has been validated using benchmark databases from the machine learning community.
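
The abstract does not define the ACE, so the following is only a minimal sketch of how a rule-pair score of this kind could be computed. It assumes the ACE of a rule "premise -> x" takes a J-measure-like form, p(premise) * [p(x|premise) log2(p(x|premise)/p(x)) + p(not x|premise) log2(p(not x|premise)/p(not x))], with 0 < p(x) < 1. The function names `ace` and `rule_pair_score` and their argument layout are hypothetical and are not taken from MEPRO.

```python
from math import log2


def _term(q, p):
    """Return q * log2(q / p), using the convention 0 * log 0 = 0."""
    return 0.0 if q == 0.0 else q * log2(q / p)


def ace(p_x, p_premise, p_x_given_premise):
    """Assumed J-measure-like ACE of a rule 'premise -> x'.

    p_x               : prior probability of the conclusion x (0 < p_x < 1)
    p_premise         : probability that the premise holds
    p_x_given_premise : probability of x given the premise
    """
    return p_premise * (
        _term(p_x_given_premise, p_x)
        + _term(1.0 - p_x_given_premise, 1.0 - p_x)
    )


def rule_pair_score(general, exception):
    """Product of the ACEs of a (general rule, exception rule) pair.

    Each argument is a tuple (p_x, p_premise, p_x_given_premise).
    """
    return ace(*general) * ace(*exception)


if __name__ == "__main__":
    # Hypothetical probabilities, chosen only to exercise the functions.
    general_rule = (0.5, 0.5, 1.0)     # premise Y strongly implies x
    exception_rule = (0.5, 0.25, 0.0)  # narrower premise implies the opposite
    print(rule_pair_score(general_rule, exception_rule))
```

Under this assumed form, a rule whose conditional probability equals the prior scores zero, so the product rewards pairs in which both the general rule and its exception are informative. A branch-and-bound search would additionally need the upper bound mentioned in the abstract, which is not specified here and is therefore not sketched.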
