Improved classification based on predictive association rules

Classification based on predictive association rules (CPAR) is an associative classification method that combines the advantages of associative classification and traditional rule-based classification. In rule generation, CPAR is more efficient than traditional rule-based classification because it avoids much repeated calculation and can select multiple literals to generate multiple rules simultaneously. Despite these advantages in rule generation, its prediction process suffers from an imbalanced distribution of rules across classes and from interference by incorrect class rules. Furthermore, CPAR cannot classify instances that satisfy no rules. To tackle these problems, this paper presents Class Weighting Adjustment, Center Vector-based Pre-classification, and Post-processing with Support Vector Machine. Experiments on the Chinese text classification corpus TanCorp show that our algorithm achieves an average improvement of 5.91% in F1 score compared with CPAR.
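To illustrate the idea behind center vector-based pre-classification, the following is a minimal sketch and not the paper's implementation. It assumes that each class is represented by the mean of its training documents' term-weight vectors, that similarity is measured by cosine, and that the top k classes are kept as candidates for the later rule-based prediction. The function names (center_vectors, cosine, candidate_classes), the parameter k, and the toy data are all hypothetical.

```python
import math
from collections import defaultdict


def center_vectors(docs, labels):
    """Average the sparse term-weight vectors of each class (assumed definition)."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for vec, label in zip(docs, labels):
        counts[label] += 1
        for term, weight in vec.items():
            sums[label][term] += weight
    return {
        label: {term: total / counts[label] for term, total in terms.items()}
        for label, terms in sums.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def candidate_classes(vec, centers, k=2):
    """Return the k classes whose center vectors are most similar to vec."""
    ranked = sorted(centers, key=lambda c: cosine(vec, centers[c]), reverse=True)
    return ranked[:k]


# Toy data for illustration only (not taken from TanCorp).
docs = [
    {"ball": 1.0, "goal": 1.0},
    {"ball": 1.0, "team": 1.0},
    {"stock": 1.0, "bank": 1.0},
    {"bank": 1.0, "loan": 1.0},
]
labels = ["sports", "sports", "finance", "finance"]
centers = center_vectors(docs, labels)
top = candidate_classes({"ball": 1.0, "team": 1.0}, centers, k=1)
```

In such a scheme, only rules whose class is among the candidates would be considered at prediction time, which is one plausible way to limit interference from incorrect class rules.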