In this paper we introduce new algorithms for unsupervised learning based on the use of a kernel matrix. All the information required by such algorithms is contained in the eigenvectors of the matrix, or of closely related matrices. We use two different but related cost functions: the alignment and the 'cut cost'. The first is discussed in a companion paper [3]; the second is based on graph-theoretic concepts. Both functions measure the degree of clustering in a labeled dataset, or equivalently the correlation between data clusters and labels. We cast unsupervised learning as the problem of assigning labels so as to optimize these cost functions. We show how the optimal solution can be approximated by slightly relaxing the corresponding optimization problem, and how this corresponds to using eigenvector information. The resulting simple algorithms are tested on real-world data with positive results.
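The eigenvector relaxation described above can be illustrated with a minimal sketch: relax the discrete label-assignment problem, take the dominant eigenvector of the kernel matrix, and threshold its entries to recover a two-class labeling. The function name, the zero threshold, and the toy Gaussian (RBF) kernel used for the usage example are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def spectral_labels_from_kernel(K, threshold=0.0):
    """Hypothetical sketch: approximate the alignment-optimal labeling by
    thresholding the dominant eigenvector of the kernel (Gram) matrix K."""
    K = (K + K.T) / 2.0                         # symmetrize for numerical safety
    eigvals, eigvecs = np.linalg.eigh(K)        # eigenvalues in ascending order
    v = eigvecs[:, -1]                          # eigenvector of the largest eigenvalue
    return np.where(v >= threshold, 1, -1)      # threshold entries to +/-1 labels

# Usage example (assumed setup): an RBF kernel over two well-separated blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq_dists / 2.0)
print(spectral_labels_from_kernel(K))
```

The sign pattern of the leading eigenvector serves here as the relaxed stand-in for the discrete label vector; other thresholds or related matrices (e.g. a graph Laplacian, for the cut-cost view) could be substituted in the same template.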
[1] Alex Pothen, et al. Partitioning Sparse Matrices with Eigenvectors of Graphs, 1990.
[2] Prabhakar Raghavan, et al. Sparse matrix reordering schemes for browsing hypertext, 1996.
[3] Nello Cristianini, et al. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, 2000.
[4] N. Cristianini, et al. On Kernel-Target Alignment, NIPS, 2001.
[5] Dustin Boswell, et al. Introduction to Support Vector Machines, 2002.