Using emerging patterns and decision trees in rare-class classification

The problem of classifying rarely occurring cases is faced in many real life applications. The scarcity of the rare cases makes it difficult to classify them correctly using traditional classifiers. In this paper, we propose an approach to use emerging patterns (EPs) (G. Dong and J. Li, 1999) and decision trees (DTs) in rare-class classification (EPDT). EPs are those itemsets whose supports in one class are significantly higher than their supports in the other classes. EPDT employs the power of EPs to improve the quality of rare-case classification. To achieve this aim, we first introduce the idea of generating nonexisting rare-class instances, and then we over-sample the most important rare-class instances. Our experiments show that EPDT outperforms many classification methods.