Editorial survey: swarm intelligence for data mining

This paper surveys the intersection of two fascinating and increasingly popular domains: swarm intelligence and data mining. Whereas data mining has been a popular academic topic for decades, swarm intelligence is a relatively new subfield of artificial intelligence that studies the emergent collective intelligence of groups of simple agents. It is based on social behavior observed in nature, such as ant colonies, flocks of birds, schools of fish and bee hives, where a number of individuals with limited capabilities arrive at intelligent solutions to complex problems. In recent years the swarm intelligence paradigm has received widespread research attention, mainly in the form of Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO), which are also the most popular swarm intelligence metaheuristics for data mining. In addition to an overview of these nature-inspired computing methodologies, we discuss popular data mining techniques based on these principles and schematically list their main differences in our literature tables. Further, we provide a unifying framework that categorizes swarm-intelligence-based data mining algorithms into two approaches: effective search and data organizing. Finally, we list interesting issues for future research, thereby identifying methodological gaps in current research and mapping the opportunities provided by swarm intelligence to current challenges within data mining research.
