Paper Information - Evaluation Measures for Frequent Itemsets Based on Distributed Representations

Evaluation Measures for Frequent Itemsets Based on Distributed Representations

Frequent itemset mining and association rule mining are fundamental problems in data mining. Despite intensive and continuous research on frequent itemset mining, one essential drawback remains not completely solved: pattern explosion, in which a huge number of frequent itemsets, or patterns, are often derived. Association rule mining inherits the same drawback, since each association rule consists of two frequent itemsets. One promising practical way to alleviate this drawback, applied as a post-processing step, is to rank the patterns and rules from various aspects and so identify which ones to examine first. In this paper, we propose using representation learning to rank frequent itemsets and association rules. More specifically, we develop several evaluation measures for frequent itemsets and association rules using vector representations of items, transactions, and frequent patterns, obtained with existing representation learning techniques. By employing vector representations of these three kinds of objects, we can expect to obtain various rankings that reflect different aspects.
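The abstract does not define the proposed measures, so the following is only a minimal sketch of the general idea of ranking itemsets from item vectors. It assumes a hypothetical measure, here called `cohesion`, defined as the mean pairwise cosine similarity of the item vectors in an itemset. The measure, the function names, and the toy vectors are illustrative assumptions and are not taken from the paper.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cohesion(itemset, item_vectors):
    """Hypothetical measure: mean pairwise cosine similarity of the items.

    An assumption for illustration, not a measure defined in the paper.
    A single-item itemset has no pairs and is scored 1.0 by convention.
    """
    pairs = list(combinations(sorted(itemset), 2))
    if not pairs:
        return 1.0
    total = sum(cosine(item_vectors[a], item_vectors[b]) for a, b in pairs)
    return total / len(pairs)


def rank_itemsets(itemsets, item_vectors):
    """Return the itemsets sorted from most to least cohesive."""
    return sorted(itemsets, key=lambda s: cohesion(s, item_vectors), reverse=True)


# Toy item vectors, made up for this example; in practice they would come
# from an existing representation learning technique.
ITEM_VECTORS = {
    "bread": [1.0, 0.0],
    "butter": [0.9, 0.1],
    "nails": [0.0, 1.0],
}
ITEMSETS = [frozenset({"bread", "nails"}), frozenset({"bread", "butter"})]
RANKED = rank_itemsets(ITEMSETS, ITEM_VECTORS)
```

With these toy vectors, the itemset containing `bread` and `butter` ranks above the one containing `bread` and `nails`, because its item vectors point in nearly the same direction. Measures built on transaction or pattern vectors would need those vectors as additional inputs.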

Tomonobu Ozaki