Pattern Clustering Using a Swarm Intelligence Approach

Clustering aims to represent large datasets by a smaller number of prototypes or clusters. It simplifies the modeling of data and thus plays a central role in knowledge discovery and data mining. Data mining tasks today require fast and accurate partitioning of huge datasets, which may come with a variety of attributes or features. This, in turn, imposes severe computational requirements on the relevant clustering techniques. A family of bio-inspired algorithms known as Swarm Intelligence (SI) has recently emerged that meets these requirements and has been applied successfully to a number of real-world clustering problems. This chapter explores the role of SI in clustering different kinds of datasets. It then describes a new SI technique for partitioning a linearly non-separable dataset into an optimal number of clusters in the kernel-induced feature space. Computer simulations are also provided to demonstrate the effectiveness of the proposed algorithm.
