Applying Bagging Techniques to the SA Tabu Miner Rule Induction Algorithm

This paper presents an implementation of bagging techniques on top of the heuristic algorithm for induction of classification rules called SA Tabu Miner (Simulated Annealing and Tabu Search data miner). The goal was to improve the predictive accuracy of the derived classification rules. Bagging (bootstrap aggregating) is an ensemble method that has attracted considerable attention, both experimentally, because it performs well on noisy datasets, and theoretically, because of its simplicity. We present experimental results for several bagging variants of the SA Tabu Miner algorithm, which is itself inspired both by research on heuristic optimization algorithms and by rule induction concepts and principles from data mining. Several bootstrap methodologies were applied to SA Tabu Miner, including reducing the repetition of instances, capping the repetition of any single instance at two, and drawing samples of different percentages of the original training set. The different experimental approaches and parameter settings yielded different results across the compared datasets.
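To make the sampling variants concrete, the following is a minimal Python sketch, not the authors' implementation: `bootstrap_sample` and `bagged_predict` are hypothetical names introduced here for illustration, and the SA Tabu Miner rule inducer itself (represented by the hypothetical `train_sa_tabu_miner`) would stand in for the base classifiers.

```python
import random
from collections import Counter

def bootstrap_sample(data, sample_pct=1.0, max_repeats=None):
    # Draw a bootstrap sample (sampling with replacement) whose size is
    # sample_pct of the original training set. max_repeats optionally caps
    # how many times any single instance may recur (e.g. 2, as in the
    # capped-repetition variant); None gives the plain bootstrap.
    n = int(len(data) * sample_pct)
    if max_repeats is not None and n > max_repeats * len(data):
        raise ValueError("repetition cap too low for the requested sample size")
    counts = Counter()
    sample = []
    while len(sample) < n:
        i = random.randrange(len(data))
        if max_repeats is not None and counts[i] >= max_repeats:
            continue  # reject draws that would exceed the repetition cap
        counts[i] += 1
        sample.append(data[i])
    return sample

def bagged_predict(classifiers, x):
    # Plain-bagging aggregation: each base classifier votes and the
    # majority label wins.
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]

# Hypothetical usage: train one rule set per bootstrap sample, then
# classify a new instance by majority vote over the ensemble.
# models = [train_sa_tabu_miner(bootstrap_sample(train, 0.8, max_repeats=2))
#           for _ in range(ensemble_size)]
# label = bagged_predict(models, instance)
```

The repetition cap is enforced here by rejection sampling, which terminates whenever the cap times the dataset size is at least the requested sample size; any equivalent scheme would do.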
