Data Clustering: A User's Dilemma

Cluster analysis deals with the automatic discovery of groupings in a set of patterns. Despite more than 40 years of research, many challenges remain in data clustering, from both theoretical and practical viewpoints. In this paper, we describe several recent advances in data clustering: clustering ensembles, feature selection, and clustering with constraints.
