Rare Association Rule Mining via Transaction Clustering

Rare association rule mining has received considerable attention in recent years. In this research, we use transaction clustering as a pre-processing step for generating rare association rules. Transaction clustering builds on the notion of large items as defined by traditional association rule mining algorithms. We use the approach proposed by Koh & Pears (2008) to cluster transactions before mining for association rules. We show that clustering the dataset first allows each cluster to express its own associations without interference or contamination from other subgroups that have different patterns of relationships. Our results show that the rare rules produced within each cluster are more informative than the rules found by mining the unpartitioned dataset directly.
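The cluster-then-mine idea can be illustrated with a minimal sketch. It is not the seeds-based clustering of Koh & Pears (2008): the cluster labels are assumed to be supplied by some earlier clustering step, only pairwise rules are mined, and the toy transactions, thresholds, and function names (`mine_rules`, `mine_per_cluster`) are hypothetical choices made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def mine_rules(transactions, min_support, min_confidence):
    """Return pairwise rules as (antecedent, consequent, support, confidence)."""
    n = len(transactions)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for t in transactions:
        items = sorted(set(t))
        for item in items:
            item_count[item] += 1
        for a, b in combinations(items, 2):
            pair_count[(a, b)] += 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        # Try both directions of the rule for each frequent pair.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_count[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


def mine_per_cluster(transactions, labels, min_support, min_confidence):
    """Group transactions by cluster label, then mine each group separately."""
    clusters = defaultdict(list)
    for t, label in zip(transactions, labels):
        clusters[label].append(t)
    return {
        label: mine_rules(ts, min_support, min_confidence)
        for label, ts in clusters.items()
    }


# Toy data (invented): a dominant pattern and a rare one.
transactions = [{"bread", "milk"}] * 8 + [{"caviar", "champagne"}] * 2
# Assumed output of a prior clustering step, one label per transaction.
labels = [0] * 8 + [1] * 2

global_rules = mine_rules(transactions, min_support=0.5, min_confidence=0.8)
per_cluster = mine_per_cluster(transactions, labels, 0.5, 0.8)
```

On this toy data the caviar and champagne pair has a support of 0.2 in the unpartitioned dataset, so it falls below the 0.5 threshold and is missed by direct mining. Within its own cluster the same pair has a support of 1.0, so the rule is found there without lowering the threshold for the whole dataset.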

[1] Yun Sing Koh, et al. Finding Sporadic Rules Using Apriori-Inverse, 2005, PAKDD.

[2] Johannes Gehrke, et al. CACTUS—clustering categorical data using summaries, 1999, KDD '99.

[3] Hui Xiong, et al. A New Clustering Algorithm for Transaction Data via Caucus, 2003, PAKDD.

[4] Ming-Syan Chen, et al. An efficient clustering algorithm for market basket data based on small large ratios, 2001, 25th Annual International Computer Software and Applications Conference. COMPSAC 2001.

[5] Yun Sing Koh, et al. Transaction Clustering Using a Seeds Based Approach, 2008, PAKDD.

[6] Sudipto Guha, et al. ROCK: a robust clustering algorithm for categorical attributes, 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[7] Keun Ho Ryu, et al. Mining association rules on significant rare data using relative support, 2003, J. Syst. Softw.

[8] Yun Sing Koh, et al. Mining Interesting Imperfectly Sporadic Rules, 2006, PAKDD.

[9] Ke Wang, et al. Clustering transactions using large items, 1999, CIKM '99.

[10] Tomasz Imielinski, et al. Mining association rules between sets of items in large databases, 1993, SIGMOD Conference.

[11] Catherine Blake, et al. UCI Repository of machine learning databases, 1998.

[12] Anil K. Jain, et al. Data clustering: a review, 1999, CSUR.

[13] Wynne Hsu, et al. Mining association rules with multiple minimum supports, 1999, KDD '99.

[14] Alexandre Villeminot, et al. Combined use of association rules mining and clustering methods to find relevant links between binary rare attributes in a large data set, 2007, Comput. Stat. Data Anal.

[15] David R. Karger, et al. Scatter/Gather: a cluster-based approach to browsing large document collections, 1992, SIGIR '92.