Integrating Classification and Association Rule Mining

Classification rule mining aims to discover a small set of rules in the database that forms an accurate classifier. Association rule mining finds all the rules existing in the database that satisfy some minimum support and minimum confidence constraints. For association rule mining, the target of discovery is not pre-determined, while for classification rule mining there is one and only one predetermined target. In this paper, we propose to integrate these two mining techniques. The integration is done by focusing on mining a special subset of association rules, called class association rules (CARs). An efficient algorithm is also given for building a classifier based on the set of discovered CARs. Experimental results show that the classifier built this way is, in general, more accurate than that produced by the state-of-the-art classification system C4.5. In addition, this integration helps to solve a number of problems that exist in the current classification systems.

[1]  Ryszard S. Michalski,et al.  Pattern Recognition as Rule-Guided Inductive Inference , 1980, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Usama M. Fayyad,et al.  Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning , 1993, IJCAI.

[3]  Stan Matwin,et al.  Using Qualitative Models to Guide Inductive Learning , 1993, ICML.

[4]  Jeffrey C. Schlimmer,et al.  Efficiently Inducing Determinations: A Complete and Systematic Search Algorithm that Uses Optimal Pruning , 1993, ICML.

[5]  Michael J. Pazzani,et al.  Exploring the Decision Forest: An Empirical Investigation of Occam's Razor in Decision Tree Induction , 1993, J. Artif. Intell. Res..

[6]  Ramakrishnan Srikant,et al.  Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.

[7]  Ron Kohavi,et al.  MLC++: a machine learning library in C++ , 1994, Proceedings Sixth International Conference on Tools with Artificial Intelligence. TAI 94.

[8]  R. Agarwal Fast Algorithms for Mining Association Rules , 1994, VLDB 1994.

[9]  R. Mike Cameron-Jones,et al.  Oversearching and Layered Search in Empirical Learning , 1995, IJCAI.

[10]  Nils J. Nilsson,et al.  MLC++, A Machine Learning Library in C++. , 1995 .

[11]  Ron Kohavi,et al.  Supervised and Unsupervised Discretization of Continuous Features , 1995, ICML.

[12]  Jorma Rissanen,et al.  SLIQ: A Fast Scalable Classifier for Data Mining , 1996, EDBT.

[13]  Wynne Hsu,et al.  Post-Analysis of Learned Rules , 1996, AAAI/IAAI, Vol. 1.

[14]  Christopher J. Merz,et al.  UCI Repository of Machine Learning Databases , 1996 .

[15]  Ramakrishnan Srikant,et al.  Mining quantitative association rules in large relational tables , 1996, SIGMOD '96.

[16]  Kamal Ali,et al.  Partial Classification Using Association Rules , 1997, KDD.

[17]  Roberto J. Bayardo Brute-Force Mining of High-Confidence Classification Rules , 1997, KDD.

[18]  Michael J. Pazzani,et al.  Beyond Concise and Colorful: Learning Intelligible Rules , 1997, KDD.

[19]  Wynne Hsu,et al.  Using General Impressions to Analyze Discovered Classification Rules , 1997, KDD.

[20]  Yasuhiko Morimoto,et al.  Computing Optimized Rectilinear Regions for Association Rules , 1997, KDD.

[21]  D. J. Newman,et al.  UCI Repository of Machine Learning Database , 1998 .

[22]  Ke Wang,et al.  Interestingness-Based Interval Merger for Numeric Association Rules , 1998, KDD.

[23]  Catherine Blake,et al.  UCI Repository of machine learning databases , 1998 .