Interestingness Measures for Association Patterns: A Perspective

ABSTRACT Association rules are valuable patterns because they offer useful insight into the types of dependencies that exist between attributes of a data set. Because algorithms such as Apriori are complete, the number of patterns extracted is often very large. Therefore, there is a need to prune or rank the discovered patterns according to their degree of interestingness. In this paper, we examine the various interestingness measures proposed in the statistics, machine learning and data mining literature. We compare these measures and investigate how closely they reflect the statistical notion of correlation. We show that support-based pruning, which is often used in association rule discovery, is appropriate because it removes mostly uncorrelated and negatively correlated patterns. Our experimental results verify that many of the intuitive measures (such as Piatetsky-Shapiro's rule-interest, confidence, Laplace, entropy gain, etc.) are very similar in nature to the correlation coefficient in the region of support values typically encountered in practice. Finally, we introduce a new metric, called the IS measure, and show that it is highly linear with respect to the correlation coefficient for many interesting association patterns.
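To make the comparison between the correlation coefficient and the IS measure concrete, the following is a minimal sketch in Python. The abstract does not define either quantity, so two assumptions are made here: correlation is taken to be the phi coefficient of a 2x2 contingency table for binary items A and B, and IS is taken to be P(A,B) / sqrt(P(A) P(B)), i.e. the geometric mean of the interest factor and the support. The function names and the example counts are illustrative only and are not taken from the paper.

```python
from math import sqrt


def phi_coefficient(n11, n10, n01, n00):
    """Phi (correlation) coefficient of a 2x2 contingency table.

    n11: transactions containing both A and B
    n10: transactions containing A but not B
    n01: transactions containing B but not A
    n00: transactions containing neither
    """
    row_a, row_not_a = n11 + n10, n01 + n00
    col_b, col_not_b = n11 + n01, n10 + n00
    denom = sqrt(row_a * row_not_a * col_b * col_not_b)
    if denom == 0:
        raise ValueError("phi is undefined when a marginal count is zero")
    return (n11 * n00 - n10 * n01) / denom


def is_measure(n11, n10, n01, n00):
    """Assumed definition of IS: P(A,B) / sqrt(P(A) * P(B)).

    The total count cancels, leaving n11 / sqrt(count(A) * count(B)).
    """
    count_a, count_b = n11 + n10, n11 + n01
    if count_a == 0 or count_b == 0:
        raise ValueError("IS is undefined when A or B never occurs")
    return n11 / sqrt(count_a * count_b)


def support(n11, n10, n01, n00):
    """Fraction of transactions containing both A and B."""
    return n11 / (n11 + n10 + n01 + n00)


# Hypothetical counts for a positively correlated pair of items.
table = (30, 10, 10, 50)
print(support(*table), phi_coefficient(*table), is_measure(*table))
```

For these hypothetical counts the support is 0.3, phi is about 0.583 and IS is 0.75. For a statistically independent table such as (25, 25, 25, 25), phi is 0 while IS is 0.5, which illustrates that under this assumed definition IS does not vanish at independence the way phi does.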