Recent developments in clustering algorithms

In this paper, we give a short review of recent developments in clustering. We briefly summarize important clustering paradigms before addressing key topics, including metric adaptation in clustering, handling non-Euclidean data or large data sets, clustering evaluation, and learning-theoretical foundations.
