Efficient Search of Reliable Exceptions

Finding patterns in data sets is a fundamental task of data mining. If we categorize all patterns as strong, weak, or random, conventional data mining techniques are designed to find only strong patterns, which hold for numerous objects and are usually consistent with the expectations of experts. While such strong patterns are helpful in prediction, the unexpectedness and contradiction exhibited by weak patterns are also very useful, even though weak patterns cover a relatively small number of objects. In this paper, we address the problem of finding weak patterns (i.e., reliable exceptions) in databases. We propose a simple and efficient approach that uses deviation analysis to identify interesting exceptions and to explore reliable ones. The approach is also flexible in handling both subjective and objective exceptions. We demonstrate its effectiveness on a set of real-life data sets and present interesting findings.
