Semi-supervised graph clustering: a kernel approach

Semi-supervised clustering algorithms aim to improve clustering results using limited supervision. The supervision is generally given as pairwise constraints; such constraints are natural for graphs, yet most semi-supervised clustering algorithms are designed for data represented as vectors. In this paper, we unify vector-based and graph-based approaches. We show that a recently-proposed objective function for semi-supervised clustering based on Hidden Markov Random Fields, with squared Euclidean distance and a certain class of constraint penalty functions, can be expressed as a special case of the weighted kernel k-means objective. A recent theoretical connection between kernel k-means and several graph clustering objectives enables us to perform semi-supervised clustering of data given either as vectors or as a graph. For vector data, the kernel approach also enables us to find clusters with non-linear boundaries in the input data space. Furthermore, we show that recent work on spectral learning (Kamvar et al., 2003) may be viewed as a special case of our formulation. We empirically show that our algorithm is able to outperform current state-of-the-art semi-supervised algorithms on both vector-based and graph-based data sets.

[1]  Nello Cristianini,et al.  An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .

[2]  Claire Cardie,et al.  Constrained K-means Clustering with Background Knowledge , 2001, ICML.

[3]  Alexander J. Smola,et al.  Kernels and Regularization on Graphs , 2003, COLT.

[4]  Nello Cristianini,et al.  An introduction to Support Vector Machines , 2000 .

[5]  Samuel Kaski,et al.  Clustering Based on Conditional Distributions in an Auxiliary Space , 2002, Neural Computation.

[6]  Raymond J. Mooney,et al.  Integrating constraints and metric learning in semi-supervised clustering , 2004, ICML.

[7]  S. S. Ravi,et al.  Clustering with Constraints: Feasibility Issues and the k-Means Algorithm , 2005, SDM.

[8]  Joachim M. Buhmann,et al.  Learning with constrained and unlabelled data , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[9]  Dan Klein,et al.  From Instance-level Constraints to Space-Level Constraints: Making the Most of Prior Knowledge in Data Clustering , 2002, ICML.

[10]  Zhengdong Lu,et al.  Semi-supervised Learning with Penalized Probabilistic Clustering , 2004, NIPS.

[11]  Jitendra Malik,et al.  Normalized Cuts and Image Segmentation , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[12]  Nicole Immorlica,et al.  Approximation, Randomization, and Combinatorial Optimization.. Algorithms and Techniques , 2003, Lecture Notes in Computer Science.

[13]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[14]  Arindam Banerjee,et al.  Active Semi-Supervision for Pairwise Constrained Clustering , 2004, SDM.

[15]  Anil K. Jain,et al.  Model-based Clustering With Probabilistic Constraints , 2005, SDM.

[16]  Tomer Hertz,et al.  Learning Distance Functions using Equivalence Relations , 2003, ICML.

[17]  R. Mooney,et al.  Impact of Similarity Measures on Web-page Clustering , 2000 .

[18]  Rong Yan,et al.  A Discriminative Learning Framework with Pairwise Constraints for Video Object Classification , 2006, IEEE Trans. Pattern Anal. Mach. Intell..

[19]  Hong Chang,et al.  Locally linear metric adaptation for semi-supervised clustering , 2004, ICML.

[20]  Arindam Banerjee,et al.  Semi-supervised Clustering by Seeding , 2002, ICML.

[21]  Inderjit S. Dhillon,et al.  Kernel k-means: spectral clustering and normalized cuts , 2004, KDD.

[22]  Alexander Zien,et al.  Semi-Supervised Learning , 2006 .

[23]  Richard O. Duda,et al.  Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.

[24]  Mikhail Bilenko and Sugato Basu A Comparison of Inference Techniques for Semi-supervised Clustering with Hidden Markov Random Fields , 2004 .

[25]  Éva Tardos,et al.  Approximation algorithms for classification problems with pairwise relationships: metric labeling and Markov random fields , 1999, 40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039).

[26]  Trevor Darrell,et al.  The pyramid match kernel: discriminative classification with sets of image features , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[27]  Jianbo Shi,et al.  Multiclass spectral clustering , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[28]  Martine D. F. Schlag,et al.  Spectral K-way ratio-cut partitioning and clustering , 1994, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..

[29]  Nikhil Bansal,et al.  Correlation Clustering , 2002, The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings..

[30]  Nello Cristianini,et al.  Efficiently Learning the Metric with Side-Information , 2003, ALT.

[31]  Venkatesan Guruswami,et al.  Clustering with qualitative information , 2005, 44th Annual IEEE Symposium on Foundations of Computer Science, 2003. Proceedings..

[32]  Jianbo Shi,et al.  A Random Walks View of Spectral Segmentation , 2001, AISTATS.

[33]  Raymond J. Mooney,et al.  A probabilistic framework for semi-supervised clustering , 2004, KDD.

[34]  Dan Klein,et al.  Spectral Learning , 2003, IJCAI.

[35]  Jianbo Shi,et al.  Segmentation given partial grouping constraints , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Susumu Goto,et al.  KEGG: Kyoto Encyclopedia of Genes and Genomes , 2000, Nucleic Acids Res..

[37]  Michael I. Jordan,et al.  Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.

[38]  Shailesh V. Date,et al.  A Probabilistic Functional Network of Yeast Genes , 2004, Science.