Classification and Associative Classification Rule Discovery Using Ant Colony Optimization

The primary goal of this research is to investigate the suitability of ant colony optimization (ACO), a swarm-intelligence-based meta-heuristic that mimics aspects of the food-foraging behavior of ants, for building accurate and comprehensible classifiers that can be learned in reasonable time even for large datasets. Towards this end, a novel classification rule discovery algorithm called AntMiner-C and its variants are proposed. Various aspects and parameters of the proposed algorithms are investigated by experimentation on a number of benchmark datasets. Experimental results indicate that the proposed approach builds more accurate models than commonly used classification algorithms. It is also computationally less expensive than previously available ACO-based classification rule discovery algorithms. A hybrid classifier using ACO is also proposed that combines association rule mining and supervised classification. Experiments show that the proposed algorithm is able to discover high-quality rules. Furthermore, the association rules of each class can be mined in parallel if distributed processing is used. Experimental results demonstrate that the proposed hybrid classifier achieves higher accuracy than other commonly used classification algorithms. Finally, a feature subset selection algorithm based on ACO and decision trees is proposed. Experiments show that using the subset of features selected by this approach instead of the full feature set yields better accuracy and substantially decreases the number of rules.
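For readers unfamiliar with ant colony optimization (ACO), the following is a minimal sketch of the generic ACO selection and pheromone-update step, applied to choosing one rule term (an attribute-value pair). It is not the AntMiner-C algorithm itself: the function names, the example terms, the heuristic values, and the parameters `alpha`, `beta`, and `rho` are all illustrative assumptions.

```python
import random

# Generic ACO building blocks (illustrative only, not AntMiner-C).
# A "term" is a hypothetical attribute-value pair such as ("outlook", "sunny").

def selection_probabilities(pheromone, heuristic, alpha=1.0, beta=1.0):
    """Standard ACO rule: P(term) is proportional to pheromone^alpha * heuristic^beta."""
    weights = {t: (pheromone[t] ** alpha) * (heuristic[t] ** beta) for t in pheromone}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}

def choose_term(probabilities, rng):
    """Roulette-wheel selection of one term according to its probability."""
    terms = list(probabilities)
    return rng.choices(terms, weights=[probabilities[t] for t in terms], k=1)[0]

def update_pheromone(pheromone, used_terms, quality, rho=0.1):
    """Evaporate all trails by a factor (1 - rho), then reinforce the terms of a good rule."""
    for t in pheromone:
        pheromone[t] *= (1.0 - rho)
    for t in used_terms:
        pheromone[t] += quality

# Hypothetical candidate terms with equal initial pheromone.
terms = [("outlook", "sunny"), ("outlook", "rainy"), ("windy", "true")]
pheromone = {t: 1.0 for t in terms}
# Assumed heuristic desirability of each term (made-up numbers).
heuristic = {terms[0]: 0.6, terms[1]: 0.3, terms[2]: 0.1}

probs = selection_probabilities(pheromone, heuristic)
picked = choose_term(probs, random.Random(0))

# Pretend a rule built from the first term scored quality 0.5.
update_pheromone(pheromone, [terms[0]], quality=0.5)
```

With equal pheromone, the selection probabilities simply follow the heuristic values; after the update, the reinforced term carries more pheromone than the others, so later ants become more likely to pick it.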

[1]  Karim Faez,et al.  Feature Selection Using Ant Colony Optimization (ACO): A New Method and Comparative Study in the Application of Face Recognition System , 2007, ICDM.

[2]  A. E. Eiben,et al.  Introduction to Evolutionary Computing , 2003, Natural Computing Series.

[3]  Hans-Peter Kriegel,et al.  Optimal multi-step k-nearest neighbor search , 1998, SIGMOD '98.

[4]  Soo-Hyung Kim,et al.  Segmentation of Brain MR Images Using an Ant Colony Optimization Algorithm , 2009, 2009 Ninth IEEE International Conference on Bioinformatics and BioEngineering.

[5]  Steven Salzberg,et al.  A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features , 2004, Machine Learning.

[6]  Jason Catlett,et al.  Overprvning Large Decision Trees , 1991, IJCAI.

[7]  Luca Maria Gambardella,et al.  An Ant Colony System Hybridized with a New Local Search for the Sequential Ordering Problem , 2000, INFORMS J. Comput..

[8]  C. Yun-Huoy,et al.  CMARGA: PRUNING DECISION TREE USING GENETIC ALGORITHM IN CLASSIFICATION BASED ON MULTIPLE ASSOCIATION RULES , 2012 .

[9]  Jian Pei,et al.  CMAR: accurate and efficient classification based on multiple class-association rules , 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[10]  Mahamed G. H. Omran Particle swarm optimization methods for pattern recognition and image processing , 2006 .

[11]  Mohamed Deriche Feature Selection using Ant Colony Optimization , 2009, 2009 6th International Multi-Conference on Systems, Signals and Devices.

[12]  Masahiro Inuiguchi,et al.  Rule Induction Via Clustering Decision Classes , 2006, RSCTC.

[13]  Alex Alves Freitas,et al.  A new version of the ant-miner algorithm discovering unordered rule sets , 2006, GECCO '06.

[14]  William W. Cohen Fast Effective Rule Induction , 1995, ICML.

[15]  E. Salari,et al.  An ACO algorithm for graph coloring problem , 2005, 2005 ICSC Congress on Computational Intelligence Methods and Applications.

[16]  Byoung-Tak Zhang,et al.  Combining Information-Based Supervised and Unsupervised Feature Selection , 2006, Feature Extraction.

[17]  Anil K. Jain,et al.  Dimensionality reduction using genetic algorithms , 2000, IEEE Trans. Evol. Comput..

[18]  Marco Dorigo,et al.  The ant colony optimization meta-heuristic , 1999 .

[19]  K. M. Sim,et al.  Multiple ant-colony optimization for network routing , 2002, First International Symposium on Cyber Worlds, 2002. Proceedings..

[20]  Michael J. A. Berry,et al.  Data mining techniques - for marketing, sales, and customer support , 1997, Wiley computer publishing.

[21]  Alex Alves Freitas,et al.  A hybrid PSO/ACO algorithm for classification , 2007, GECCO '07.

[22]  Huan Liu,et al.  Feature subset selection bias for classification learning , 2006, ICML.

[23]  Nir Friedman,et al.  Bayesian Network Classifiers , 1997, Machine Learning.

[24]  Xing Zhang,et al.  A new approach to classification based on association rule mining , 2006, Decis. Support Syst..

[25]  Jian Pei,et al.  Mining frequent patterns without candidate generation , 2000, SIGMOD '00.

[26]  Ada Wai-Chee Fu,et al.  FP-tree approach for mining N-most interesting itemsets , 2002, SPIE Defense + Commercial Sensing.

[27]  P. Cunningham,et al.  Solutions to Instability Problems with Sequential Wrapper-based Approaches to Feature Selection , 2002 .

[28]  J. Ross Quinlan,et al.  Generating Production Rules from Decision Trees , 1987, IJCAI.

[29]  J. Ross Quinlan,et al.  Improved Use of Continuous Attributes in C4.5 , 1996, J. Artif. Intell. Res..

[30]  VYoshinori Yaginuma High-performance Data Mining System , 2001 .

[31]  Ajith Abraham,et al.  Swarm Intelligence in Data Mining , 2009, Swarm Intelligence in Data Mining.

[32]  Zhili Wu,et al.  Feature Selection for Classification using Transductive Support Vector Machines , 2004 .

[33]  Werner Ceusters,et al.  Medical Natural Language Understanding as a Supporting Technology for Data Mining in Healthcare. , 2001 .

[34]  Jia-Ling Koh,et al.  An Efficient Approach for Mining Fault-Tolerant Frequent Patterns Based on Bit Vector Representations , 2005, DASFAA.

[35]  Jihoon Yang,et al.  Feature Subset Selection Using a Genetic Algorithm , 1998, IEEE Intell. Syst..

[36]  S. Cessie,et al.  Ridge Estimators in Logistic Regression , 1992 .

[37]  Alex A. Freitas,et al.  A survey of evolutionary algorithms for data mining and knowledge discovery , 2003 .

[38]  Thomas Stützle,et al.  Ant colony optimization: artificial ants as a computational intelligence technique , 2006 .

[39]  John S. Usher,et al.  Facility layout using swarm intelligence , 2005, Proceedings 2005 IEEE Swarm Intelligence Symposium, 2005. SIS 2005..

[40]  Pei-Chann Chang,et al.  Database Classification by Integrating a Case-Based Reasoning and Support Vector Machine for Induction , 2010, J. Circuits Syst. Comput..

[41]  Jun Zhang,et al.  Ant Colony System for Optimizing Vehicle Routing Problem with Time Windows , 2005, International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC'06).

[42]  K. R. Seeja,et al.  An Association Rule Mining Approach for Co-Regulated Signature genes Identification in Cancer , 2009, J. Circuits Syst. Comput..

[43]  Bing Liu,et al.  Classification Using Association Rules: Weaknesses and Enhancements , 2001 .

[44]  Shamkant B. Navathe,et al.  An Efficient Algorithm for Mining Association Rules in Large Databases , 1995, VLDB.

[45]  Andries P. Engelbrecht,et al.  Image Classification using Particle Swarm Optimization , 2002, SEAL.

[46]  Alex A. Freitas,et al.  New Results for a Hybrid Decision Tree/Genetic Algorithm for Data Mining , 2004 .

[47]  James Kennedy,et al.  Defining a Standard for Particle Swarm Optimization , 2007, 2007 IEEE Swarm Intelligence Symposium.

[48]  Chih-Jen Lin,et al.  Combining SVMs with Various Feature Selection Strategies , 2006, Feature Extraction.

[49]  Jirong Li Feature Selection Based on Correlation between Fuzzy Features and Optimal Fuzzy-Valued Feature Subset Selection , 2008, 2008 International Conference on Intelligent Information Hiding and Multimedia Signal Processing.

[50]  Remco R. Bouckaert Naive Bayes Classifiers That Perform Well with Continuous Variables , 2004, Australian Conference on Artificial Intelligence.

[51]  Pierre Loonis,et al.  Combination, Cooperation And Selection Of Classifiers: A State Of The Art , 2003, Int. J. Pattern Recognit. Artif. Intell..

[52]  A. Engelbrecht Computational Intelligence: An Introduction, Second Edition , 2007 .

[53]  J. Bezdek,et al.  Generalized k -nearest neighbor rules , 1986 .

[54]  Jean-Louis Deneubourg,et al.  The dynamics of collective sorting robot-like ants and ant-like robots , 1991 .

[55]  Yoram Singer,et al.  Improved Boosting Algorithms Using Confidence-rated Predictions , 1998, COLT' 98.

[56]  Stephen A. Billings,et al.  Feature Subset Selection and Ranking for Data Dimensionality Reduction , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[57]  Manuel López-Ibáñez,et al.  Ant colony optimization , 2010, GECCO '10.

[58]  Marco Dorigo,et al.  An Investigation of some Properties of an "Ant Algorithm" , 1992, PPSN.

[59]  Abdul Hanan Abdullah,et al.  An ant colony optimization for dynamic job scheduling in grid environment , 2007 .

[60]  W TsangIvor,et al.  Core Vector Machines: Fast SVM Training on Very Large Data Sets , 2005 .

[61]  Sung-Shun Weng,et al.  Mining time series data for segmentation by using Ant Colony Optimization , 2006, Eur. J. Oper. Res..

[62]  Kyuseok Shim,et al.  PUBLIC: A Decision Tree Classifier that Integrates Building and Pruning , 1998, Data Mining and Knowledge Discovery.

[63]  Antoinette C. Tessmer,et al.  What to Learn from Near Misses: An Inductive Learning Approach to Credit Risk Assessment , 1997 .

[64]  Alex Alves Freitas,et al.  Web Page Classification with an Ant Colony Algorithm , 2004, PPSN.

[65]  WASEEM SHAHZAD,et al.  Compatibility as a Heuristic for Construction of Rules by Artificial Ants , 2010, J. Circuits Syst. Comput..

[66]  Trevor P. Martin,et al.  Feature Subset Selection Using a Fuzzy Method , 2009, 2009 International Conference on Intelligent Human-Machine Systems and Cybernetics.

[67]  Alex Alves Freitas,et al.  Data mining with an ant colony optimization algorithm , 2002, IEEE Trans. Evol. Comput..

[68]  B. Chakraborty Feature subset selection by particle swarm optimization with fuzzy fitness function , 2008, 2008 3rd International Conference on Intelligent System and Knowledge Engineering.

[69]  Bart Baesens,et al.  Ant-Based Approach to the Knowledge Fusion Problem , 2006, ANTS Workshop.

[70]  Thomas A. Runkler,et al.  Fuzzy classification in ant feature selection , 2008, 2008 IEEE International Conference on Fuzzy Systems (IEEE World Congress on Computational Intelligence).

[71]  Tomasz Imielinski,et al.  Mining association rules between sets of items in large databases , 1993, SIGMOD Conference.

[72]  Wynne Hsu,et al.  Integrating Classification and Association Rule Mining , 1998, KDD.

[73]  Luca Maria Gambardella,et al.  Ant Algorithms for Discrete Optimization , 1999, Artificial Life.

[74]  Yiming Ma,et al.  Improving an Association Rule Based Classifier , 2000, PKDD.

[75]  Bruce A. Draper,et al.  Feature selection from huge feature sets , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[76]  Shuangcheng Wang,et al.  Feature Subset Selection Based on Bayesian Networks , 2009, 2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery.

[77]  Tzung-Pei Hong,et al.  An Efficient FUFP-tree Maintenance Algorithm for Record Modification , 2008 .

[78]  Alex A. Freitas,et al.  An ant colony based system for data mining: applications to medical data , 2001 .

[79]  Vittorio Maniezzo,et al.  The Ant System Applied to the Quadratic Assignment Problem , 1999, IEEE Trans. Knowl. Data Eng..

[80]  G. Theraulaz,et al.  Inspiration for optimization from social insect behaviour , 2000, Nature.

[81]  Monique Snoeck,et al.  Classification With Ant Colony Optimization , 2007, IEEE Transactions on Evolutionary Computation.

[82]  Vincent S. Tseng,et al.  MINING TEMPORAL RARE UTILITY ITEMSETS IN LARGE DATABASES USING RELATIVE UTILITY THRESHOLDS , 2008 .

[83]  Pat Langley,et al.  Selection of Relevant Features and Examples in Machine Learning , 1997, Artif. Intell..

[84]  Bhaskar D. Kulkarni,et al.  An ant colony classifier system: application to some process engineering problems , 2004, Comput. Chem. Eng..

[85]  Tim Oates,et al.  The Effects of Training Set Size on Decision Tree Complexity , 1997, ICML.

[86]  Robert Tibshirani,et al.  Discriminant Adaptive Nearest Neighbor Classification , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[87]  Mohamed A. Deriche,et al.  A new mutual information based measure for feature selection , 2003, Intell. Data Anal..

[88]  Marco Dorigo,et al.  Ant system: optimization by a colony of cooperating agents , 1996, IEEE Trans. Syst. Man Cybern. Part B.

[89]  Santhosh Swaminathan Rule induction using ant colony optimization for mixed variable attributes , 2006 .

[90]  Hussein A. Abbass,et al.  Classification rule discovery with ant colony optimization , 2003, IEEE/WIC International Conference on Intelligent Agent Technology, 2003. IAT 2003..

[91]  Cheng-Jian Lin,et al.  Classification and medical diagnosis using wavelet-based fuzzy neural networks , 2008 .

[92]  Michael I. Jordan,et al.  Feature selection for high-dimensional genomic microarray data , 2001, ICML.

[93]  Bo Liu,et al.  Density-Based Heuristic for Rule Discovery with Ant-Miner , 2002 .

[94]  Alex Alves Freitas,et al.  A new ant colony algorithm for multi-label classification with applications in bioinfomatics , 2006, GECCO.

[95]  Michelle Galea,et al.  Simultaneous Ant Colony Optimization Algorithms for Learning Linguistic Fuzzy Rules , 2006, Swarm Intelligence in Data Mining.

[96]  Peter Clark,et al.  The CN2 Induction Algorithm , 1989, Machine Learning.