Data mining is increasingly applied in business information and intelligence systems to discover interesting patterns and knowledge that support decision making. One of its most basic and important tasks is mining frequent itemsets, that is, sets of items that customers frequently purchase together. Many methods have been proposed for this problem. However, mining the complete set of frequent itemsets often produces a huge solution space. Fortunately, the problem can be reduced to mining Frequent Closed Itemsets (FCIs), which yields a much smaller yet representative set of customer purchase patterns. Still, databases contain redundancies that can be eliminated to improve both space and time efficiency. In this paper, we propose a novel data structure, the Transaction Pattern List (TPL), for eliminating data redundancies, and design the algorithm TPLFCI-Mining for mining FCIs efficiently with the TPL. We evaluate our algorithm under more rigorous conditions than those used for previously proposed methods. Experimental results show that our method is efficient for both sparse and dense databases, and scales to large databases even at low support thresholds.
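To make the relationship between frequent itemsets and FCIs concrete, the following is a minimal brute-force sketch of the standard definitions: an itemset is frequent if its support count meets a threshold, and closed if no proper superset has the same support. It is not the TPL structure or the TPLFCI-Mining algorithm, whose details are not given here; the function names, the toy transactions, and the threshold `min_support=2` are assumptions chosen purely for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for itemsets with count >= min_support.

    Brute-force enumeration, exponential in the number of items;
    suitable only for tiny illustrative inputs.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_support:
                result[candidate] = count
    return result


def closed_itemsets(frequent):
    """Keep only itemsets that have no proper superset with equal support."""
    return {
        itemset: count
        for itemset, count in frequent.items()
        if not any(
            itemset < other and count == other_count
            for other, other_count in frequent.items()
        )
    }


# Hypothetical toy purchase data (not from the paper's experiments).
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk", "eggs"}),
    frozenset({"bread", "milk", "eggs"}),
    frozenset({"milk"}),
]

frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)

print(len(frequent), "frequent itemsets")
print(len(closed), "closed itemsets")
for itemset, count in sorted(closed.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), count)
```

On this toy input, seven frequent itemsets collapse to three closed ones, and the support of every frequent itemset can still be recovered as the largest support among the closed itemsets that contain it. This is the sense in which the closed set is smaller yet representative.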