Particle Swarm Clustering in clustering ensembles: Exploiting pruning and alignment free consensus

Highlights
- A new consensus function based on the Particle Swarm Clustering algorithm.
- An efficient, alignment-free representation for both disjoint and overlapping partitions.
- Employment of evolutionary operators for ensemble pruning.

A clustering ensemble uses a consensus function to combine the partitions generated by a set of independent base clusterers. This study investigates the use of both particle swarm clustering (PSC) and ensemble pruning (i.e., selective reduction of the base partitions) by means of evolutionary techniques in the design of the consensus function. In the proposed ensemble, PSC plays two roles. First, it is used as a base clusterer. Second, it is employed in the consensus function, arguably the most challenging element of the ensemble. The proposed consensus function exploits a representation of the base partitions that makes cluster alignment unnecessary, allows the combination of partitions with different numbers of clusters, and supports both disjoint and overlapping (fuzzy, probabilistic, and possibilistic) partitions. Results on both synthetic and real-world data sets show that the proposed ensemble can produce statistically significantly better partitions, in terms of the validity indices used, than the best base partition available in the ensemble. In general, a small number of selected base partitions (below 20% of the total) yields the best results. Moreover, results produced by the proposed ensemble compare favorably to those of state-of-the-art clustering algorithms, and especially to those of swarm-based clustering ensemble algorithms.
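
The abstract does not spell out the representation used by the consensus function. As an illustration only, the sketch below shows one common alignment-free encoding, which is an assumption here and not necessarily the authors' method: each base partition is turned into a membership matrix, and the matrices are concatenated column-wise so that every object is described by its memberships across all base partitions. The function names (`to_membership`, `stack_partitions`, `pairwise_sq_dists`) and the toy partitions are hypothetical.

```python
import numpy as np

def to_membership(partition, n_objects):
    """Return an (n_objects, k) membership matrix.

    A 1-D label vector (disjoint partition) is one-hot encoded;
    a 2-D matrix (fuzzy/probabilistic/possibilistic) is used as given.
    """
    arr = np.asarray(partition)
    if arr.ndim == 1:
        labels, inverse = np.unique(arr, return_inverse=True)
        m = np.zeros((n_objects, labels.size))
        m[np.arange(n_objects), inverse] = 1.0
        return m
    return arr.astype(float)

def stack_partitions(partitions):
    """Concatenate the membership matrices of all base partitions column-wise.

    Partitions may have different numbers of clusters; no cluster
    alignment (label matching) between partitions is performed.
    """
    n = len(partitions[0])
    return np.hstack([to_membership(p, n) for p in partitions])

def pairwise_sq_dists(x):
    """Squared Euclidean distance between every pair of rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return (diff ** 2).sum(axis=-1)

# Toy base partitions of four objects (assumed data, for illustration only).
p1 = [0, 0, 1, 1]                      # disjoint, 2 clusters
p2 = [1, 1, 0, 0]                      # same grouping as p1, labels swapped
p3 = np.array([[0.8, 0.1, 0.1],        # fuzzy, 3 clusters
               [0.7, 0.2, 0.1],
               [0.1, 0.6, 0.3],
               [0.1, 0.2, 0.7]])

features = stack_partitions([p1, p2, p3])   # one row per object
```

In this encoding, relabeling the clusters of a base partition only permutes columns within that partition's block, so distances between objects are unchanged; this is the sense in which no alignment step is needed. Any centroid-based consensus clusterer could then, in principle, operate on the rows of `features`.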
