Adjusting and generalizing CBA algorithm to handling class imbalance

Associative classification has attracted substantial interest in recent years and has been shown to yield good results. However, research in this field has focused on classifiers that output class labels, while the probability classifiers required for imbalanced data have not been addressed comprehensively. This investigation presents a new associative classification method called Probabilistic Classification based on Association Rules (PCAR). PCAR modifies the rule sorting index, the pruning method, and the scoring procedure of the CBA algorithm. In this way, CBA is generalized to construct a probability classifier, and the efficiency of associative classification in predicting imbalanced data can be improved. Experiments on both benchmark datasets and real-life application datasets show that the new method outperforms the previous associative classification algorithm and C5.0 on all datasets. On some datasets, its predictive performance also exceeds that of logistic regression and a neural network.
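To make the idea of turning a rule-based class classifier into a probability classifier concrete, the following is a minimal sketch, not PCAR itself. It ranks rules with the standard CBA precedence (higher confidence, then higher support, then earlier generation) and, as an illustrative assumption, scores a record by the confidence of the first matching rule. The `Rule` class, the `cba_sort` and `score` functions, the binary-class setting, and the default score are all hypothetical names and choices introduced here; the abstract does not specify PCAR's modified sorting index, pruning method, or scoring procedure.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Rule:
    antecedent: FrozenSet[str]  # items that must all appear in a record
    label: str                  # class predicted by the rule
    confidence: float           # P(label | antecedent) estimated on training data
    support: float              # fraction of training records covered
    order: int                  # generation order, used as the final tie-breaker


def cba_sort(rules: List[Rule]) -> List[Rule]:
    """Rank rules by CBA precedence: confidence, then support, then order."""
    return sorted(rules, key=lambda r: (-r.confidence, -r.support, r.order))


def score(record: FrozenSet[str], ranked: List[Rule],
          positive: str, default: float) -> float:
    """Return a probability-like score for the positive class.

    Assumption (not from the paper): use the first matching rule's
    confidence, flipped when that rule predicts the other class.
    """
    for rule in ranked:
        if rule.antecedent <= record:  # every antecedent item is present
            if rule.label == positive:
                return rule.confidence
            return 1.0 - rule.confidence
    return default  # no rule fires, fall back to a prior such as the base rate


rules = [
    Rule(frozenset({"a"}), "pos", 0.6, 0.3, 0),
    Rule(frozenset({"b"}), "neg", 0.9, 0.1, 1),
    Rule(frozenset({"a", "b"}), "pos", 0.9, 0.2, 2),
]
ranked = cba_sort(rules)
p_ab = score(frozenset({"a", "b"}), ranked, positive="pos", default=0.05)
p_b = score(frozenset({"b"}), ranked, positive="pos", default=0.05)
p_none = score(frozenset({"c"}), ranked, positive="pos", default=0.05)
```

Returning a continuous score instead of a label lets records be ranked and evaluated with threshold-free measures such as the area under the ROC curve, which is often preferred when the positive class is rare.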
