Towards Rare Itemset Mining

We describe a general approach to rare itemset mining. While the data mining literature has focused almost exclusively on frequent itemsets, in many practical situations rare itemsets are of greater interest (e.g., in medical databases, rare combinations of symptoms might provide useful insights for physicians). Based on an examination of the relevant substructures of the mining space, our approach splits the rare itemset mining task into two steps: traversal of the frequent itemset part of the search space, and listing of the rare itemsets. We propose two algorithms for step one, a naive one and an optimized one, and a third algorithm for step two. We also provide some empirical evidence of the performance gains due to the optimized traversal.
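
To make the two-step decomposition concrete, the following is a minimal sketch and not the algorithms proposed in the paper, whose details the abstract does not give. It assumes that an itemset is rare when its support falls below a threshold `min_support`, that step one is a naive levelwise (Apriori-style) traversal which records the rare itemsets whose proper subsets are all frequent, and that step two lists every superset of those itemsets by brute force. The function names, the toy dataset, and the threshold are all hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def traverse_frequent_part(transactions, min_support):
    """Step one (assumed naive levelwise traversal).

    Returns two dicts mapping itemsets to supports: the frequent itemsets,
    and the minimal rare itemsets (rare, with all proper subsets frequent).
    """
    items = sorted({i for t in transactions for i in t})
    frequent, minimal_rare = {}, {}
    candidates = [frozenset([i]) for i in items]
    k = 1
    while candidates:
        level_frequent = []
        for c in candidates:
            s = support(c, transactions)
            if s >= min_support:
                frequent[c] = s
                level_frequent.append(c)
            else:
                # Every proper subset of a candidate is frequent by construction.
                minimal_rare[c] = s
        # Build size-(k+1) candidates whose size-k subsets are all frequent.
        k += 1
        seen = set()
        for a, b in combinations(level_frequent, 2):
            u = a | b
            if len(u) == k and all(
                frozenset(sub) in frequent for sub in combinations(u, k - 1)
            ):
                seen.add(u)
        candidates = sorted(seen, key=sorted)
    return frequent, minimal_rare


def list_rare_itemsets(transactions, minimal_rare):
    """Step two (assumed brute force): every superset of a minimal rare itemset.

    Exponential in the number of items; for illustration only.
    """
    items = sorted({i for t in transactions for i in t})
    rare = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            c = frozenset(combo)
            if any(m <= c for m in minimal_rare):
                rare[c] = support(c, transactions)
    return rare


# Hypothetical toy dataset: five transactions over items a to e.
transactions = [
    frozenset("abde"),
    frozenset("ac"),
    frozenset("abce"),
    frozenset("bce"),
    frozenset("abce"),
]
frequent, minimal_rare = traverse_frequent_part(transactions, min_support=3)
rare = list_rare_itemsets(transactions, minimal_rare)
print(sorted("".join(sorted(m)) for m in minimal_rare))  # ['abc', 'ace', 'd']
print(len(frequent), len(rare))  # 12 19
```

In this sketch the first step never counts the support of an itemset that has a rare proper subset, so the traversal stays inside the frequent part of the search space and touches only its border. The second step includes rare itemsets with zero support; a practical implementation would likely restrict or prune that listing.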
