Exploiting Maximal Emerging Patterns for Classification

Classification is an important data mining problem. Emerging Patterns (EPs) are itemsets whose supports change significantly from one data class to another. Previous studies have shown that classifiers based on EPs are competitive with other state-of-the-art classification systems. In this paper, we propose a new type of Emerging Pattern, called Maximal Emerging Patterns (MaxEPs), which are the longest EPs satisfying certain constraints. MaxEPs can be used to condense the vast amount of information, resulting in a significantly smaller set of high-quality patterns for classification. We also develop a new "overlapping" or "intersection" based mechanism to exploit the properties of MaxEPs. Our new classifier, Classification by Maximal Emerging Patterns (CMaxEP), combines the advantages of the Bayesian approach and EP-based classifiers. Experimental results on 36 benchmark datasets from the UCI machine learning repository demonstrate that our method achieves better overall classification accuracy than the JEP-classifier, CBA, C5.0 and NB.
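
To make the notion of an itemset whose support changes between classes concrete, the following is a minimal Python sketch, not the CMaxEP algorithm itself. It computes the support of an itemset in each class, takes the ratio of the two supports (the growth rate commonly used to define EPs), and keeps the patterns that are not contained in any longer pattern. The function names, the toy data, the `min_growth` threshold and the brute-force enumeration are all illustrative assumptions. The constraints that define MaxEPs are not stated in this abstract, so plain set inclusion stands in for them here.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def growth_rate(itemset, background, target):
    """Ratio of the support in `target` to the support in `background`."""
    s_bg = support(itemset, background)
    s_tg = support(itemset, target)
    if s_tg == 0:
        return 0.0           # absent from the target class
    if s_bg == 0:
        return float("inf")  # occurs only in the target class
    return s_tg / s_bg


def emerging_patterns(background, target, min_growth):
    """Brute-force enumeration; only suitable for tiny toy examples."""
    items = sorted(set().union(*background, *target))
    found = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if growth_rate(candidate, background, target) >= min_growth:
                found.append(candidate)
    return found


def maximal(patterns):
    """Keep patterns that are not a proper subset of another pattern.

    Assumption: maximality under set inclusion only, without the
    additional constraints the paper imposes on MaxEPs.
    """
    return [p for p in patterns if not any(p < q for q in patterns)]


# Hypothetical toy data: each transaction is a set of items.
class_a = [frozenset("ab"), frozenset("abc"), frozenset("ac")]
class_b = [frozenset("bc"), frozenset("c"), frozenset("b")]

eps = emerging_patterns(background=class_b, target=class_a, min_growth=2.0)
max_eps = maximal(eps)
```

On this toy data, four itemsets qualify as EPs from class B to class A, and the maximality filter condenses them to a single pattern, which illustrates how keeping only the longest patterns can shrink the pattern set.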