Analysis of $p$-Laplacian Regularization in Semi-Supervised Learning

We investigate a family of regression problems in a semi-supervised setting. The task is to assign real-valued labels to a set of $n$ sample points, given a small training subset of $N$ labeled points. A goal of semi-supervised learning is to take advantage of the (geometric) structure provided by the large number of unlabeled data when assigning labels. We consider random geometric graphs, with connection radius $\epsilon(n)$, to represent the geometry of the data set. The functionals that model the task reward the regularity of the estimator function and impose or reward agreement with the training data. Here we consider the discrete $p$-Laplacian regularization. We investigate the asymptotic behavior when the number of unlabeled points increases while the number of training points remains fixed. We uncover a delicate interplay between the regularizing nature of the functionals considered and the nonlocality inherent to the graph constructions. We rigorously obtain almost optimal ranges on the scaling of $\epsilon(n)$ for asymptotic consistency to hold. We prove that the minimizers of the discrete functionals in the random setting converge uniformly to the desired continuum limit. Furthermore, we discover that for the standard model used there is a restrictive upper bound on how quickly $\epsilon(n)$ must converge to zero as $n \to \infty$. We introduce a new model which is as simple as the original model, but overcomes this restriction.
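To make the setup concrete, the following is a minimal sketch of a constrained graph regularization problem of the kind described above. It is an illustration under stated assumptions, not the paper's construction: the indicator edge weights, the $1/(\epsilon^p n^2)$ normalization, the hard label constraint, and all function names are hypothetical choices. Only the case $p = 2$ is solved, because there the constrained minimizer reduces to a linear system; general $p$ would need an iterative solver.

```python
import numpy as np

def geometric_graph_weights(x, eps):
    """Indicator weights (an assumption): w_ij = 1 if 0 < |x_i - x_j| < eps."""
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    w = (dist < eps).astype(float)
    np.fill_diagonal(w, 0.0)  # no self-loops
    return w

def p_dirichlet_energy(w, f, p, eps):
    """Discrete p-Dirichlet energy, with an assumed 1 / (eps^p n^2) scaling."""
    n = len(f)
    diffs = np.abs(f[:, None] - f[None, :]) ** p
    return float((w * diffs).sum() / (eps ** p * n ** 2))

def harmonic_extension(w, labeled_idx, labels):
    """Minimize the p = 2 energy subject to f_i = y_i on the labeled points."""
    n = w.shape[0]
    lap = np.diag(w.sum(axis=1)) - w  # unnormalized graph Laplacian
    unlabeled = np.setdiff1d(np.arange(n), labeled_idx)
    f = np.zeros(n)
    f[labeled_idx] = labels
    rhs = -lap[np.ix_(unlabeled, labeled_idx)] @ labels
    # lstsq rather than solve: the system is singular if some connected
    # component of the graph contains no labeled point.
    f[unlabeled] = np.linalg.lstsq(
        lap[np.ix_(unlabeled, unlabeled)], rhs, rcond=None
    )[0]
    return f

# Usage: 11 equally spaced points on [0, 1], labels 0 and 1 at the endpoints.
x = np.linspace(0.0, 1.0, 11)[:, None]
eps = 0.15  # connects nearest neighbours only (spacing is 0.1)
w = geometric_graph_weights(x, eps)
f = harmonic_extension(w, np.array([0, 10]), np.array([0.0, 1.0]))
```

On this path graph the $p = 2$ minimizer is the linear interpolant between the two labels. In the regime the abstract studies, $n$ grows with $N$ fixed, and whether such minimizers stay well behaved depends on how $\epsilon(n)$ scales.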
