Contextual Distance for Data Perception

Structural perception of data plays a fundamental role in pattern analysis and machine learning. In this paper, we develop a new structural perception of data based on local contexts. We first identify the contextual set of a point by finding its nearest neighbors. The contextual distance between the point and one of its neighbors is then defined as the difference between their contributions to the integrity of the geometric structure of the contextual set, where that structure is characterized by a structural descriptor. The centroid and the coding length are introduced as two examples of such descriptors. Furthermore, a directed graph (digraph) is built to model the asymmetry of perception, with edges weighted according to the contextual distances. Direction is thereby introduced into otherwise undirected data, and structural perception can be performed by mining the properties of the digraph. We also present a method for deriving the global digraph Laplacian from the alignment of the local digraph Laplacians. Experimental results on clustering and ranking, for both toy problems and real data, demonstrate the advantages of asymmetric perception.
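The steps above can be illustrated with a minimal sketch that uses the centroid as the structural descriptor. Several details here are assumptions rather than definitions taken from the abstract: a point's contribution is taken to be the shift of the centroid when that point is left out of the contextual set, the contextual distance is the absolute difference of two such contributions, and edge weights come from a Gaussian kernel with a per-point scale. All function names are hypothetical.

```python
import numpy as np

def contextual_set(X, i, k):
    """Return the index i followed by the indices of its k nearest neighbors."""
    d = np.linalg.norm(X - X[i], axis=1)
    order = np.argsort(d, kind="stable")
    order = order[order != i][:k]
    return np.concatenate(([i], order))

def centroid_contributions(S):
    """Assumed contribution: how far the centroid moves when a point is removed."""
    n = S.shape[0]
    full = S.mean(axis=0)
    # Centroid of the set without point j, computed for all j at once.
    leave_one_out = (n * full - S) / (n - 1)
    return np.linalg.norm(full - leave_one_out, axis=1)

def contextual_distances(X, i, k):
    """Assumed contextual distance from point i to each of its k neighbors."""
    idx = contextual_set(X, i, k)
    c = centroid_contributions(X[idx])
    return idx[1:], np.abs(c[0] - c[1:])

def contextual_digraph(X, k):
    """Weighted adjacency matrix; row i holds the edges leaving point i."""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        nbrs, dist = contextual_distances(X, i, k)
        sigma = dist.mean() + 1e-12  # assumed per-point kernel scale
        W[i, nbrs] = np.exp(-(dist / sigma) ** 2)
    return W

# Three nearby points and one outlier on a line.
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [10.0, 0.0]])
W = contextual_digraph(X, k=2)
```

In this toy input the outlier has an edge to the point at (2, 0), but that point has no edge back, because the outlier is not among its two nearest neighbors. The adjacency matrix is therefore asymmetric, which is the property the digraph is meant to capture.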
