Kernel functions based on triplet similarity comparisons

We propose two ways of defining a kernel function on a data set when the only available information about the data set consists of similarity triplets of the form "Object A is more similar to object B than to object C". Studying machine learning and data mining problems based on such restricted information has become popular in recent years because such triplets can easily be provided by humans via crowdsourcing. While previous approaches try to construct a low-dimensional Euclidean embedding of the data set that reflects the given similarity triplets, we aim to define meaningful kernel functions on the data set that correspond to high-dimensional embeddings. These kernel functions can subsequently be used to apply standard kernel methods to tasks such as clustering, classification, or principal component analysis on the data set.
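
To make the idea of a kernel that corresponds to a high-dimensional embedding concrete, the following is a minimal sketch under our own assumptions; it is not necessarily either of the two kernels proposed in the paper. Each object A is mapped to a vector with one coordinate per unordered pair {B, C}, set to +1 or -1 according to the observed triplet and 0 if no triplet is available, and the kernel is the inner product of the normalised vectors. The function name `triplet_kernel`, the triplet encoding `(a, b, c)`, and the toy positions are all hypothetical and serve illustration only.

```python
import numpy as np

def triplet_kernel(n, triplets):
    """Gram matrix from triplets (a, b, c): 'a is more similar to b than to c'.

    Illustrative construction (an assumption, not taken from the source):
    an explicit feature map followed by a normalised inner product.
    """
    # One feature coordinate per pair (b, c) with b < c, indexed as b * n + c.
    features = np.zeros((n, n * n))
    for a, b, c in triplets:
        if b < c:
            features[a, b * n + c] = 1.0   # a is closer to b than to c
        else:
            features[a, c * n + b] = -1.0  # a is closer to c than to b
    # Normalise rows so k(x, x) = 1 for every object with at least one triplet.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    features /= norms
    # An inner product of explicit features is positive semi-definite.
    return features @ features.T

# Toy data: four points on a line; triplets are derived from true distances.
x = np.array([0.0, 1.0, 5.0, 6.0])
n = len(x)
triplets = [
    (a, b, c)
    for a in range(n) for b in range(n) for c in range(n)
    if len({a, b, c}) == 3 and abs(x[a] - x[b]) < abs(x[a] - x[c])
]
K = triplet_kernel(n, triplets)
```

Because `K` is a symmetric positive semi-definite Gram matrix, it can be passed to any kernel method that accepts a precomputed kernel. In this toy example, nearby points (0 and 1) receive a larger kernel value than distant ones (0 and 2).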
