Evaluation Measures for Extended Association Rules Based on Distributed Representations

Indirect association rules and association action rules are two notable extensions of traditional association rules. Because each of these extended rules consists of a pair of association rules, both inherit the essential drawback of association rules: a huge number of rules is derived when the target database is dense or the minimum threshold is set low. One practical way to alleviate this drawback is to rank the rules in a post-processing step, so as to identify which ones should be examined first. In this paper, as a new application of representation learning, we propose evaluation measures for indirect association rules and for association action rules, respectively. The proposed measures are assessed in a preliminary evaluation using two datasets: one from a Japanese video-sharing site and one on nursery.
