Laplacian Eigenmaps From Sparse, Noisy Similarity Measurements

Manifold learning and dimensionality reduction techniques are ubiquitous in science and engineering, but can be computationally expensive when applied to large datasets or when similarities are costly to compute. To date, little work has investigated the tradeoff between computational resources and the quality of learned representations. We present both theoretical and experimental explorations of this question. In particular, we consider Laplacian eigenmaps embeddings based on a kernel matrix, and explore how the embeddings behave when this kernel matrix is corrupted by occlusion and noise. Our main theoretical result shows that, under modest noise and occlusion assumptions, we can (with high probability) recover a good approximation to the Laplacian eigenmaps embedding based on the uncorrupted kernel matrix. Our results also show how regularization can aid this approximation. Experimentally, we explore the effects of noise and occlusion on Laplacian eigenmaps embeddings of two real-world datasets, one from speech processing and one from neuroscience, as well as a synthetic dataset.
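
The setting described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the Gaussian kernel, the Bernoulli occlusion mask with observation probability p, the additive Gaussian noise with standard deviation sigma, the clipping of noisy similarities to nonnegative values, the regularization that adds tau/n to every entry, and the hypothetical helper names laplacian_eigenmaps, corrupt, and subspace_error. The sketch embeds a small synthetic dataset from the clean kernel and from a corrupted one, then compares the two embeddings by the sine of the largest principal angle between their column spaces.

```python
import numpy as np

def laplacian_eigenmaps(K, d, tau=0.0):
    """Embed n points in d dimensions from a symmetric similarity matrix K.

    tau is a hypothetical regularization constant (an assumption, not the
    paper's definition): tau / n is added to every entry before normalizing.
    """
    n = K.shape[0]
    W = K + tau / n
    deg = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))  # guard empty rows
    # Top eigenvectors of D^{-1/2} W D^{-1/2} are the bottom eigenvectors
    # of the normalized Laplacian I - D^{-1/2} W D^{-1/2}.
    S = inv_sqrt[:, None] * W * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    # Drop the leading (trivial) eigenvector and keep the next d.
    return vecs[:, -(d + 1):-1][:, ::-1]

def corrupt(K, p, sigma, rng):
    """Keep each off-diagonal pair with probability p and add Gaussian noise."""
    n = K.shape[0]
    upper = np.triu(rng.random((n, n)) < p, k=1)
    mask = upper | upper.T                     # symmetric observation mask
    noise = np.triu(rng.normal(0.0, sigma, (n, n)), k=1)
    noise = noise + noise.T                    # symmetric noise
    K_obs = np.where(mask, K + noise, 0.0)     # occluded entries become 0
    return np.clip(K_obs, 0.0, None)           # assumption: similarities >= 0

def subspace_error(X, Y):
    """Sine of the largest principal angle between orthonormal bases X and Y."""
    s = np.linalg.svd(X.T @ Y, compute_uv=False)
    return float(np.sqrt(max(0.0, 1.0 - s.min() ** 2)))

rng = np.random.default_rng(0)
# Synthetic data: two Gaussian blobs in the plane (illustrative only).
pts = np.vstack([rng.normal(0.0, 0.5, (30, 2)), rng.normal(3.0, 0.5, (30, 2))])
sq = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-sq / 2.0)                          # Gaussian kernel, bandwidth 1

K_obs = corrupt(K, p=0.5, sigma=0.05, rng=rng)
X_clean = laplacian_eigenmaps(K, d=2)
X_noisy = laplacian_eigenmaps(K_obs, d=2, tau=1.0)
print("subspace error:", subspace_error(X_clean, X_noisy))
```

Varying p, sigma, and tau in this sketch gives a rough feel for how occlusion, noise, and regularization affect the embedding, though it makes no claim to reproduce the paper's experiments or its theoretical conditions.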
