Clustering aims to find hidden structure in data. In this paper we present a new clustering algorithm that builds upon the local and global consistency method (Zhou et al., 2003), a semi-supervised learning technique that learns smooth functions with respect to the intrinsic structure revealed by the data. Starting from this algorithm, we derive an optimization framework that discovers structure in data without requiring labeled data. This framework simultaneously optimizes all learning parameters and selects the optimal number of clusters. It also allows easy detection of both global outliers and outliers within clusters. Finally, we show that the learned cluster models can be used to assign previously unseen points to the clusters without re-learning the original cluster model. Encouraging experimental results are obtained on a number of toy and real-world problems.
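For context, the sketch below illustrates the local and global consistency (LGC) propagation that the paper builds on: a Gaussian affinity graph, symmetric normalization, and the closed-form solution F* = (I - alpha*S)^(-1) Y from Zhou et al. (2003). This is a minimal illustration only; the function and parameter names are illustrative assumptions, not the paper's own code.

```python
import numpy as np

def lgc_propagate(X, y, alpha=0.99, sigma=1.0):
    """Minimal sketch of local and global consistency label propagation.

    X : (n, d) data matrix.
    y : (n,) array with class indices for labeled points and -1 for unlabeled.
    Returns predicted class indices for all n points.
    """
    n = X.shape[0]

    # Gaussian affinity matrix with zeroed diagonal.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # Initial label matrix Y: one-hot rows for labeled points, zeros otherwise.
    classes = np.unique(y[y >= 0])
    Y = np.zeros((n, classes.size))
    for j, c in enumerate(classes):
        Y[y == c, j] = 1.0

    # Closed-form solution F* = (I - alpha * S)^{-1} Y; assign by largest score.
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    return classes[F.argmax(axis=1)]
```

The smoothness property mentioned in the abstract comes from the normalized graph Laplacian implicit in (I - alpha*S): points that are close on the data manifold receive similar scores in F.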
[1] Zoubin Ghahramani, et al. Combining Active Learning and Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions, ICML, 2003.
[2] Benjamin Naumann. The Architecture of Cognition, 2016.
[3] Petra Perner, et al. Data Mining - Concepts and Techniques, Künstliche Intelligenz, 2002.
[4] Jeff Shrager, et al. Observation of Phase Transitions in Spreading Activation Networks, Science, 1987.
[5] Bernhard Schölkopf, et al. Learning with Local and Global Consistency, NIPS, 2003.
[6] Bernhard Schölkopf, et al. Ranking on Data Manifolds, NIPS, 2003.