Bi-stochastic kernels via asymmetric affinity functions

Abstract. In this short letter we construct a bi-stochastic kernel p for an arbitrary data set X from an asymmetric affinity function α, which measures the similarity between points in X and points in some reference set Y. Unlike other methods, which obtain bi-stochastic kernels through a convergent iteration or by solving an optimization problem, the construction presented here is simple and requires neither. Furthermore, it can be viewed through the lens of out-of-sample extensions, which makes it useful for massive data sets.
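The abstract does not spell out the formula, so the following is only a minimal sketch of one non-iterative construction that fits the description; it should not be read as the letter's exact definition. It assumes finite sets X and Y with uniform weights, a strictly positive affinity matrix A with A[i, j] = α(x_i, y_j), and a Gaussian affinity chosen purely for illustration. The function name `bistochastic_kernel` and all variable names are hypothetical.

```python
import numpy as np


def bistochastic_kernel(A):
    """Return a symmetric bi-stochastic matrix built from a positive
    n x m affinity matrix A (rows: data points, columns: reference points).

    Assumed construction, not taken from the source:
      B = row-normalized A, w = column sums of B, P = B diag(1/w) B^T.
    """
    A = np.asarray(A, dtype=float)
    if A.ndim != 2 or np.any(A <= 0):
        raise ValueError("A must be a 2-D array with strictly positive entries")
    B = A / A.sum(axis=1, keepdims=True)  # each row of B sums to 1
    w = B.sum(axis=0)                     # mass each reference point receives
    return (B / w) @ B.T                  # symmetric by construction


# Illustrative data: 200 points in X, 20 reference points in Y (both synthetic).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = rng.normal(size=(20, 3))

# Asymmetric (rectangular) Gaussian affinity between X and Y; the bandwidth is arbitrary.
sq_dist = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
A = np.exp(-sq_dist / 2.0)

P = bistochastic_kernel(A)
```

Under these assumptions P is symmetric and its rows sum to one, because P1 = B diag(1/w) Bᵀ1 = B diag(1/w) w = B1 = 1; no iteration or optimization is involved. The kernel has rank at most |Y| and can be kept in factored form, and a row for a new point x needs only the affinities α(x, y_j) to the reference set, which is the sense in which such a construction resembles an out-of-sample extension.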
