An effective hybrid crossover operator for genetic algorithms to solve k-means clustering problem

The k-means clustering problem is a well-known problem with a wide range of applications. It can be summarized as finding the best k representative centers for an input data set. The k-means algorithm and its variants are fast iterative approximation algorithms for the problem. However, several studies have shown that genetic algorithms (GAs) perform more favorably. In this paper, a new crossover operator for clustering GAs is proposed. It combines a string-coded crossover operator with a real-coded crossover operator. Results from a series of experiments on benchmark data are encouraging: the proposed operator performs better than both the string-coded crossover operator and two versions of real-coded crossover operators. A method for selecting the coefficient of the combination is presented. In addition, the coding scheme and the other genetic operations, such as selection and mutation, are discussed in detail.
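
As a rough illustration of what combining a string-coded and a real-coded crossover could look like, the following minimal Python sketch treats a chromosome as a list of k centers. The abstract does not specify the operators or how they are combined, so everything here is an assumption: the one-point swap of whole centers stands in for the string-coded operator, the arithmetic (convex) blend stands in for the real-coded operator, and the hypothetical mixing coefficient `alpha` simply chooses between them at random. The function names and the parameters `alpha` and `lam` are illustrative and are not taken from the paper.

```python
import random

def one_point_crossover(p1, p2, rng=random):
    """String-style crossover: swap whole centers after a random cut point.

    Assumes both parents hold the same number of centers, k >= 2.
    """
    cut = rng.randint(1, len(p1) - 1)  # cut index in 1..k-1, inclusive
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def arithmetic_crossover(p1, p2, lam=0.5):
    """Real-coded crossover: coordinate-wise convex blend of paired centers."""
    c1 = [tuple(lam * a + (1 - lam) * b for a, b in zip(u, v))
          for u, v in zip(p1, p2)]
    c2 = [tuple((1 - lam) * a + lam * b for a, b in zip(u, v))
          for u, v in zip(p1, p2)]
    return c1, c2

def hybrid_crossover(p1, p2, alpha, lam=0.5, rng=random):
    """Assumed combination rule: with probability alpha apply the one-point
    operator, otherwise apply the arithmetic operator."""
    if rng.random() < alpha:
        return one_point_crossover(p1, p2, rng)
    return arithmetic_crossover(p1, p2, lam)

# Two parents, each encoding k = 3 centers in the plane.
parent_a = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
parent_b = [(10.0, 10.0), (11.0, 11.0), (12.0, 12.0)]
child_1, child_2 = hybrid_crossover(parent_a, parent_b, alpha=0.5)
```

In this sketch, `alpha = 1` reduces to the pure one-point operator and `alpha = 0` to the pure arithmetic operator, which makes it easy to compare a hybrid against each of its components under the same selection and mutation settings.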