Learning with Partially Absorbing Random Walks

We propose a novel stochastic process that, at each state i, is absorbed with probability α_i and follows a random edge out of i with probability 1 − α_i. We analyze its properties and show its potential for exploring graph structures. We prove that, under proper absorption rates, a random walk starting from a set S of low conductance will be mostly absorbed in S. Moreover, the absorption probabilities vary slowly inside S while dropping sharply outside it, thus implementing the desirable cluster assumption for graph-based learning. Remarkably, the partially absorbing process unifies many popular models arising in a variety of contexts, provides new insights into them, and makes it possible to transfer findings from one paradigm to another. Simulation results demonstrate its promising applications in retrieval and classification.
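To make the process concrete, here is a minimal Monte Carlo sketch of a partially absorbing random walk as described above: at each step the walk is absorbed at its current state i with probability α_i, and otherwise moves to a uniformly random neighbor. The toy graph, the uniform choice of absorption rates, and the function name absorption_probabilities are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph (assumed for illustration): two triangles {0,1,2} and {3,4,5}
# joined by the single edge (2,3), so {0,1,2} is a low-conductance set.
A = np.zeros((6, 6), dtype=int)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
for i, j in edges:
    A[i, j] = A[j, i] = 1

# Uniform absorption rate alpha_i = 0.2 (an illustrative choice).
alpha = np.full(6, 0.2)

def absorption_probabilities(A, alpha, start, n_walks=20000):
    """Estimate p[j] = P(a walk started at `start` is absorbed at j)."""
    n = A.shape[0]
    counts = np.zeros(n)
    neighbors = [np.flatnonzero(A[i]) for i in range(n)]
    for _ in range(n_walks):
        i = start
        while True:
            if rng.random() < alpha[i]:      # absorbed at the current state
                counts[i] += 1
                break
            i = rng.choice(neighbors[i])     # follow a random edge out of i
    return counts / n_walks

print(absorption_probabilities(A, alpha, start=0))
```

Running this, most of the absorption probability mass remains inside the starting cluster {0, 1, 2} and drops sharply on the other triangle, illustrating the cluster behavior the abstract claims.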
