A Generalized K-Means Algorithm with Semi-Supervised Weight Coefficients

A new classification algorithm that generalizes the k-means algorithm is proposed and is called the weighted k-means algorithm. Weight coefficients, which weight the distortions between data points and cluster centers, are incorporated into the algorithm to achieve reliable classification. A method for determining appropriate values of the weight coefficients from class-labeled data is introduced. For situations in which the statistical distribution of the data changes gradually with time, the weighted k-means algorithm is investigated for semi-supervised data composed of initial labeled data followed by unlabeled data.
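The idea of a weighted distortion can be illustrated with a minimal sketch. The abstract does not give the exact form of the weighted distortion or the method for estimating the weights, so everything here is an assumption for illustration: the distortion is taken to be a per-cluster weight times the squared Euclidean distance, the weights are supplied by the caller, and the names `assign_labels` and `weighted_kmeans` are hypothetical.

```python
import numpy as np


def assign_labels(X, centers, weights):
    """Assign each point to the cluster with the smallest weighted distortion.

    ASSUMPTION (not from the source): the weighted distortion between point x
    and center c_k is weights[k] * ||x - c_k||^2.
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Squared Euclidean distances, shape (n_points, n_clusters).
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.argmin(weights[None, :] * sq_dist, axis=1)


def weighted_kmeans(X, init_centers, weights, n_iter=20):
    """Lloyd-style iterations using the assumed weighted distortion.

    init_centers could, for example, be class means computed from the initial
    labeled data; that choice is also an assumption of this sketch.
    """
    X = np.asarray(X, dtype=float)
    centers = np.array(init_centers, dtype=float)  # copy, updated in place
    for _ in range(n_iter):
        labels = assign_labels(X, centers, weights)
        for k in range(len(centers)):
            members = X[labels == k]
            # With a per-cluster scalar weight, the minimizing center is
            # still the plain mean of the assigned points.
            if len(members) > 0:
                centers[k] = members.mean(axis=0)
    # Final assignment, consistent with the returned centers.
    return assign_labels(X, centers, weights), centers


if __name__ == "__main__":
    X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
    labels, centers = weighted_kmeans(X, [[0.0, 0.0], [10.0, 0.0]], [1.0, 1.0])
    print(labels, centers)
```

With all weights equal, this sketch reduces to the ordinary k-means algorithm. Unequal weights shift the decision boundary between clusters: a larger weight on a cluster makes points less likely to be assigned to it.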
