Emerging Patterns (EPs) are a data mining model for discovering the distinctions inherently present among a collection of datasets. However, current EP mining algorithms do not handle attributes whose values are associated with taxonomies (is-a hierarchies); they are restricted to the leaf-level attribute values of a taxonomy. In this paper, we formally introduce the problem of mining generalised emerging patterns: given a large dataset in which some attributes are hierarchical, find emerging patterns that consist of items at any level of the taxonomies. Generalised EPs are more concise and interpretable when used to describe the distinctive characteristics of a class of data. They are also considered more expressive, because they include items at higher levels of the hierarchies, which have larger supports than items at the leaf level. We formulate the problem of mining generalised EPs and present an algorithm for this task. Experiments on ten benchmark datasets show that the discovered generalised patterns, which contain items at higher levels of the hierarchies, have greater support than traditional leaf-level EPs.
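To make the notion of support at different taxonomy levels concrete, the following is a minimal sketch and not the algorithm presented in the paper. It assumes the usual definitions of support (the fraction of transactions containing a pattern) and growth rate (the ratio of a pattern's supports in two datasets). The taxonomy, the item names, and the helper functions `ancestors`, `extend`, `support`, and `growth_rate` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical is-a taxonomy, stored as child -> parent (illustrative names only).
PARENT = {
    "jacket": "outerwear",
    "ski_pants": "outerwear",
    "outerwear": "clothes",
    "shirt": "clothes",
}


def ancestors(item):
    """Return every ancestor of an item in the taxonomy."""
    found = set()
    while item in PARENT:
        item = PARENT[item]
        found.add(item)
    return found


def extend(transaction):
    """Add the ancestors of each item so higher-level items can be matched."""
    extended = set(transaction)
    for item in transaction:
        extended |= ancestors(item)
    return extended


def support(pattern, dataset):
    """Fraction of transactions that contain the pattern at any taxonomy level."""
    if not dataset:
        return 0.0
    hits = sum(1 for t in dataset if set(pattern) <= extend(t))
    return hits / len(dataset)


def growth_rate(pattern, background, target):
    """Ratio of the pattern's support in target to its support in background."""
    s_bg = support(pattern, background)
    s_tg = support(pattern, target)
    if s_bg == 0:
        return math.inf if s_tg > 0 else 0.0
    return s_tg / s_bg


# Toy data (assumed, not taken from the paper's benchmark datasets).
target = [{"jacket"}, {"ski_pants"}, {"jacket", "shirt"}, {"shirt"}]
background = [{"shirt"}, {"shirt"}, {"jacket"}, {"shirt"}]

leaf = {"jacket"}
general = {"outerwear"}

print(support(leaf, target), support(general, target))
print(growth_rate(leaf, background, target),
      growth_rate(general, background, target))
```

In this toy data the higher-level item `outerwear` covers every transaction that contains `jacket` or `ski_pants`, so its support in the target dataset can never be lower than that of either leaf item. Whether its growth rate is also higher depends on the data.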