A CONTINUUM LIMIT FOR THE PAGERANK ALGORITHM

Semi-supervised and unsupervised machine learning methods often rely on graphs to model data, prompting research on how theoretical properties of operators on graphs are leveraged in learning problems. While most of the existing literature focuses on undirected graphs, directed graphs are important in practice: they model physical, biological, and transportation networks, among many other applications. In this paper, we propose a new framework for rigorously studying continuum limits of learning algorithms on directed graphs. We use the new framework to study the PageRank algorithm, and show how it can be interpreted as a numerical scheme on a directed graph involving a type of normalized graph Laplacian. We show that the corresponding continuum limit problem, which is taken as the number of webpages grows to infinity, is a second-order, possibly degenerate, elliptic equation that contains reaction, diffusion, and advection terms. We prove that the numerical scheme is consistent and stable, and we compute explicit rates of convergence of the discrete solution to the solution of the continuum limit PDE. As applications, we prove stability and asymptotic regularity of the PageRank vector.
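For readers who want the discrete object in concrete form, the following is a minimal sketch of the textbook PageRank fixed-point iteration on a directed graph. It is not taken from the paper: the function name `pagerank`, the damping factor `alpha = 0.85`, the uniform teleportation vector `v`, and the uniform treatment of dangling pages are all illustrative assumptions, and the paper's own normalization may differ. The iteration solves r = alpha * P^T r + (1 - alpha) * v, which can be rearranged as the linear system (I - alpha * P^T) r = (1 - alpha) * v; a graph-Laplacian-type reading of PageRank starts from a system of this kind.

```python
import numpy as np


def pagerank(adj, alpha=0.85, tol=1e-12, max_iter=1000):
    """Textbook PageRank by fixed-point (power) iteration.

    adj[i, j] > 0 means page i links to page j (a directed edge i -> j).
    alpha and the uniform teleportation vector are illustrative assumptions.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)

    # Row-stochastic transition matrix of the random walk.
    # Assumption: pages with no out-links jump uniformly to any page.
    P = np.where(out_deg[:, None] > 0,
                 adj / np.maximum(out_deg[:, None], 1.0),
                 1.0 / n)

    v = np.full(n, 1.0 / n)  # uniform teleportation distribution
    r = v.copy()
    for _ in range(max_iter):
        # One step of r <- alpha * P^T r + (1 - alpha) * v.
        r_next = alpha * (P.T @ r) + (1.0 - alpha) * v
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r


# Small directed graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
adj = np.array([[0, 1, 1],
                [0, 0, 1],
                [1, 0, 0]], dtype=float)
ranks = pagerank(adj)
print(ranks, ranks.sum())
```

Because the update is a contraction with factor `alpha` in the 1-norm, the iterates converge geometrically for any `alpha < 1`, which is the discrete counterpart of the stability property the abstract refers to.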
