Application of Genetic Algorithm to Cluster Analysis: Effectiveness of Operations with the Degree of Population Diversity

The aim of the present paper is to show the effectiveness of the Genetic Algorithm (GA) for cluster analysis problems in which the classification criterion is not the Euclidean distance. GA is known as an effective method for solving combinatorial optimization problems by simulating the processes of natural evolution and natural genetics. In a traditional GA, the dispersion of the evaluation function tends to be large, because the result depends on operators that use random numbers and on the control parameters. In this paper, to reduce the dispersion of the evaluation function, the concept of the degree of population diversity is introduced as an index of the internal state of the whole population, and this index is used as a control parameter for the genetic operators (crossover, mutation and selection). Numerical simulations yield the following results: 1) The GA using the degree of population diversity is superior to the traditional GA with respect to the convergence of the evaluation function and the reduction of its variance. 2) When the data contain observation noise, the GA using the proposed method classifies the data robustly, whereas the classical clustering method is sensitive to the noise.
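
The idea of steering a genetic operator by a population-diversity index can be illustrated with a minimal sketch. The abstract does not give the paper's definition of the degree of population diversity or its control rule, so everything below is an assumption made for illustration: chromosomes are taken to be lists of cluster labels, diversity is taken to be the mean normalized Hamming distance between all pairs of chromosomes, and the mutation rate is raised linearly as diversity falls. The names `population_diversity`, `adaptive_mutation_rate`, `mutate`, `p_min` and `p_max` are hypothetical.

```python
import itertools
import random


def population_diversity(population):
    """Assumed diversity index: mean normalized Hamming distance over all
    pairs of chromosomes. 0.0 means all chromosomes are identical."""
    pairs = list(itertools.combinations(population, 2))
    if not pairs:
        return 0.0
    n_genes = len(population[0])
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / (len(pairs) * n_genes)


def adaptive_mutation_rate(diversity, p_min=0.01, p_max=0.2):
    """Assumed control rule: the lower the diversity, the higher the
    mutation rate, interpolated linearly between p_min and p_max."""
    d = min(max(diversity, 0.0), 1.0)  # clamp to [0, 1]
    return p_min + (p_max - p_min) * (1.0 - d)


def mutate(chromosome, n_clusters, rate, rng):
    """Reassign each gene to a random cluster label with probability `rate`."""
    return [
        rng.randrange(n_clusters) if rng.random() < rate else gene
        for gene in chromosome
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy population: 4 chromosomes, each assigning 6 data points to 3 clusters.
    population = [[rng.randrange(3) for _ in range(6)] for _ in range(4)]
    diversity = population_diversity(population)
    rate = adaptive_mutation_rate(diversity)
    population = [mutate(c, 3, rate, rng) for c in population]
    print(f"diversity={diversity:.3f}, mutation rate={rate:.3f}")
```

A Hamming-based index is only one possible choice. It ignores the fact that two chromosomes can describe the same partition under permuted cluster labels, so a label-invariant measure may be preferable in practice. Crossover and selection could be controlled by the same index in an analogous way.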