Analyzing the Harmonic Structure in Graph-Based Learning

We find that various well-known graph-based models exhibit a common and important harmonic structure in their target functions: the value at a vertex is approximately the weighted average of the values at its adjacent neighbors. Understanding this structure, and analyzing the loss defined over it, helps reveal important properties of the target function over a graph. In this paper, we show that the variation of the target function across a cut can be upper and lower bounded by the ratio of its harmonic loss and the cut cost. We use this result to develop an analytical tool and analyze five popular graph-based models: absorbing random walks, partially absorbing random walks, hitting times, the pseudo-inverse of the graph Laplacian, and eigenvectors of the Laplacian matrices. Our analysis sheds new light on several open questions related to these models and provides theoretical justification and guidelines for their practical use. Simulations on synthetic and real datasets confirm the potential of the proposed theory and tool.
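The harmonic property described above (each vertex value equals the weighted average of its neighbors' values) can be illustrated with a small sketch. This is not the paper's analytical tool, and it does not compute the harmonic loss or the cut bound, whose exact definitions are not given in this abstract. It only assumes a toy path graph with unit edge weights and two vertices clamped to fixed values; the function names `harmonic_extension` and `max_harmonic_residual` are hypothetical and chosen for illustration.

```python
def harmonic_extension(weights, boundary, tol=1e-12, max_iter=10_000):
    """Fill in free vertices by repeated weighted averaging of neighbors.

    weights  : symmetric adjacency matrix as a list of lists (assumed input)
    boundary : dict {vertex index: fixed value} for the clamped vertices
    """
    n = len(weights)
    f = [boundary.get(i, 0.0) for i in range(n)]
    for _ in range(max_iter):
        largest_change = 0.0
        for i in range(n):
            if i in boundary:
                continue  # clamped vertices keep their value
            degree = sum(weights[i])
            average = sum(w * f[j] for j, w in enumerate(weights[i])) / degree
            largest_change = max(largest_change, abs(average - f[i]))
            f[i] = average
        if largest_change < tol:
            break
    return f


def max_harmonic_residual(weights, f, boundary):
    """Largest gap between a free vertex value and its neighbors' weighted mean."""
    worst = 0.0
    for i, row in enumerate(weights):
        if i in boundary:
            continue
        average = sum(w * f[j] for j, w in enumerate(row)) / sum(row)
        worst = max(worst, abs(f[i] - average))
    return worst


# Toy example (an assumption, not from the paper): path graph 0-1-2-3-4
# with unit weights, vertex 0 clamped to 0.0 and vertex 4 clamped to 1.0.
path_weights = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
clamped = {0: 0.0, 4: 1.0}
values = harmonic_extension(path_weights, clamped)
residual = max_harmonic_residual(path_weights, values, clamped)
print(values, residual)
```

On this path the free vertices settle on values that interpolate between the two clamped ends, and the residual measures how far the result is from being exactly harmonic. The models listed in the abstract are described as only approximately harmonic, so for them a residual of this kind would generally not vanish.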
