Knowledge and Information Systems

Building fast and accurate classifiers for large-scale databases is an important task in data mining. There is growing evidence that integrating classification and association rule mining can produce more efficient and accurate classifiers than traditional techniques. This paper investigates the problem of producing rules with multiple labels and proposes a multi-class, multi-label associative classification approach (MMAC). It also presents four measures for evaluating the accuracy of classification approaches on a wide range of traditional and multi-label classification problems. Results for 19 different data sets from the UCI data collection and nine hyperheuristic scheduling runs show that the proposed approach is an accurate and effective classification technique that is highly competitive and scalable when compared with other traditional and associative classification approaches.
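
To make the idea of a rule with multiple labels concrete, the following is a minimal sketch and not the MMAC algorithm itself, whose details the abstract does not give. It assumes a simple support and confidence framework: for each set of attribute values, every class label that passes both thresholds is kept, ranked by frequency. The function name `mine_multilabel_rules`, its parameters, the threshold values, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def mine_multilabel_rules(rows, min_support=0.3, min_confidence=0.3, max_len=2):
    """Illustrative only: map each frequent itemset to a ranked list of labels.

    rows is a list of (set_of_items, class_label) pairs.
    A label is kept for an itemset when
      support    = count(itemset, label) / number of rows >= min_support
      confidence = count(itemset, label) / count(itemset)  >= min_confidence
    """
    n = len(rows)
    counts = defaultdict(Counter)  # itemset -> Counter of class labels
    for items, label in rows:
        for k in range(1, max_len + 1):
            for combo in combinations(sorted(items), k):
                counts[combo][label] += 1

    rules = {}
    for itemset, label_counts in counts.items():
        total = sum(label_counts.values())
        # most_common() orders labels by descending count, so the
        # strongest label for the itemset comes first.
        kept = [
            label
            for label, c in label_counts.most_common()
            if c / n >= min_support and c / total >= min_confidence
        ]
        if kept:
            rules[itemset] = kept
    return rules


# Toy training data (hypothetical): attribute values and one class per row.
rows = [
    ({"a1", "b1"}, "c1"),
    ({"a1", "b1"}, "c1"),
    ({"a1", "b1"}, "c2"),
    ({"a1", "b1"}, "c2"),
    ({"a2", "b1"}, "c2"),
    ({"a2", "b2"}, "c3"),
]

rules = mine_multilabel_rules(rows)
for itemset, labels in rules.items():
    print(itemset, "->", labels)
```

In this toy data the itemset `("a1", "b1")` co-occurs equally often with `c1` and `c2`, so a single-label rule would discard one of them, while the multi-label rule retains both.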
