The Use of Meta-Heuristic Algorithms for Data Mining

In this paper we explore the application of powerful optimisers known as metaheuristic algorithms to problems within the data mining domain. We introduce some well-known data mining problems and show how they can be formulated as optimisation problems. We then review the use of metaheuristics in this context. In particular, we focus on the task of partial classification and show how multi-objective metaheuristics have produced results that are comparable to those of the best known techniques while scaling better to large databases. We conclude by reinforcing the importance of research in the areas of metaheuristics for optimisation and data mining. The combination of robust methods for solving real-life problems in reasonable time and the ability to apply these methods to the analysis of large repositories of data may hold the key to success in many other scientific and commercial application areas.
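To make the idea of formulating partial classification as a multi-objective optimisation problem concrete, the following is a minimal sketch rather than the method used in the paper. It assumes that a candidate rule "IF condition THEN class" is scored on two objectives, confidence and coverage, and that a multi-objective metaheuristic would retain the non-dominated rules. The objective choice, the function names (`rule_objectives`, `dominates`, `pareto_front`) and the toy records are all illustrative assumptions.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Rule = Callable[[Record], bool]


def rule_objectives(rule: Rule, records: List[Record], target_class: str,
                    class_key: str = "class") -> Tuple[float, float]:
    """Return (confidence, coverage) of the rule 'IF rule THEN target_class'.

    Assumed objectives: confidence = correct / matched records,
    coverage = correct / records that belong to the target class.
    """
    matched = [r for r in records if rule(r)]
    in_class = [r for r in records if r[class_key] == target_class]
    correct = sum(1 for r in matched if r[class_key] == target_class)
    confidence = correct / len(matched) if matched else 0.0
    coverage = correct / len(in_class) if in_class else 0.0
    return confidence, coverage


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """True if a is at least as good as b on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scored: Dict[str, Tuple[float, float]]) -> Dict[str, Tuple[float, float]]:
    """Keep only the rules that no other rule dominates."""
    return {
        name: score
        for name, score in scored.items()
        if not any(dominates(other, score) for other in scored.values())
    }


# Hypothetical toy database; attribute names and values are made up.
records = [
    {"age": 25, "income": 20, "class": "buy"},
    {"age": 30, "income": 60, "class": "buy"},
    {"age": 45, "income": 80, "class": "buy"},
    {"age": 50, "income": 30, "class": "no"},
    {"age": 35, "income": 70, "class": "no"},
    {"age": 60, "income": 25, "class": "no"},
]

# Hypothetical candidate rules, as a search procedure might propose them.
candidates: Dict[str, Rule] = {
    "income >= 60": lambda r: r["income"] >= 60,
    "age <= 30": lambda r: r["age"] <= 30,
    "age <= 45": lambda r: r["age"] <= 45,
}

scored = {name: rule_objectives(rule, records, "buy")
          for name, rule in candidates.items()}
front = pareto_front(scored)

for name, (conf, cov) in front.items():
    print(f"{name}: confidence={conf:.2f}, coverage={cov:.2f}")
```

In this sketch the candidate rules are fixed by hand. A metaheuristic such as a multi-objective genetic algorithm would instead generate and refine candidates iteratively, using the same kind of dominance comparison to decide which rules to keep.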
