Gene transposon based clonal selection algorithm for clustering

Inspired by the principle of gene transposition proposed by Barbara McClintock, this paper proposes a new immune computing algorithm for clustering multi-class data sets, named the Gene Transposition based Clone Selection Algorithm (GTCSA). The proposed algorithm does not require prior knowledge of the number of clusters; an improved variant of the clonal selection algorithm is used both to determine the number of clusters and to refine the cluster centers. A novel operator called the antibody transposon is introduced into the clonal selection framework so that the optimal number of clusters can be found automatically. The proposed method has been extensively compared with Variable-string-length Genetic Algorithm (VGA) based clustering techniques over a test suite of several real-life and synthetic data sets. The experimental results indicate that GTCSA is superior to VGA in stability and convergence rate when clustering multi-class data sets.
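To make the idea concrete, the following is a minimal sketch of how a variable-length antibody and a transposon-style operator might fit into a clonal selection loop. It is not the paper's implementation: the abstract does not specify the encoding, the operator details, or the fitness measure, so everything here is an assumption. In particular, an antibody is assumed to be a list of cluster centers, the hypothetical `transpose` function inserts, deletes, or replaces one center, and the Davies-Bouldin index stands in for whatever validity index the authors use. The names `db_index`, `transpose`, and `clonal_search` are illustrative only.

```python
import math
import random


def db_index(points, centers):
    """Davies-Bouldin index (lower is better); inf for degenerate antibodies.

    Assumed stand-in fitness. Needs at least two centers to be meaningful.
    """
    k = len(centers)
    members = [[] for _ in range(k)]
    for p in points:
        nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
        members[nearest].append(p)
    if any(not m for m in members):
        return math.inf  # an empty cluster makes the antibody invalid
    scatter = [sum(math.dist(p, c) for p in m) / len(m)
               for m, c in zip(members, centers)]
    total = 0.0
    for i in range(k):
        worst = 0.0
        for j in range(k):
            if i == j:
                continue
            d = math.dist(centers[i], centers[j])
            if d == 0:
                return math.inf  # coincident centers
            worst = max(worst, (scatter[i] + scatter[j]) / d)
        total += worst
    return total / k


def transpose(antibody, points, rng, k_min=2, k_max=10):
    """Hypothetical transposon step: insert, delete, or replace one center.

    Insert and delete change the antibody length, i.e. the cluster count.
    """
    child = list(antibody)
    moves = ["replace"]
    if len(child) < k_max:
        moves.append("insert")
    if len(child) > k_min:
        moves.append("delete")
    move = rng.choice(moves)
    if move == "insert":
        child.insert(rng.randrange(len(child) + 1), rng.choice(points))
    elif move == "delete":
        child.pop(rng.randrange(len(child)))
    else:
        child[rng.randrange(len(child))] = rng.choice(points)
    return child


def clonal_search(points, generations=30, pop_size=6, clones=4, seed=0):
    """Elitist clonal selection: each antibody competes with its own clones."""
    rng = random.Random(seed)
    # Initial antibodies: between 2 and 5 centers drawn from the data.
    pop = [rng.sample(points, rng.randint(2, 5)) for _ in range(pop_size)]
    for _ in range(generations):
        next_pop = []
        for antibody in pop:
            candidates = [antibody] + [transpose(antibody, points, rng)
                                       for _ in range(clones)]
            next_pop.append(min(candidates,
                                key=lambda a: db_index(points, a)))
        pop = next_pop
    return min(pop, key=lambda a: db_index(points, a))
```

Calling `clonal_search(points)` on a list of coordinate tuples returns the best list of centers found, and its length is the estimated number of clusters. Because the parent always competes with its clones, the best fitness never gets worse between generations; refinement of the centers themselves (for example by a k-means step) is omitted to keep the sketch short.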
