A Gaussian Process Decoder with Spectral Mixtures and a Locally Estimated Manifold for Data Visualization

Dimensionality reduction plays an important role in interpreting and visualizing high-dimensional data. Previous methods for data visualization tend to overemphasize local structure while neglecting the preservation of global structure. In this study, we develop a Gaussian process latent variable model (GP-LVM) for data visualization. GP-LVMs are a probabilistic, nonlinear generalization of principal component analysis and preserve global structure effectively. Their drawbacks are the absence of local structure preservation and the use of kernel functions with limited expressiveness. We therefore extend the GP-LVM with a regularizer for local structure preservation, based on a locally estimated manifold, and an expressive spectral mixture kernel. As a result, the low-dimensional representations reflect both global and local structure, improving the reliability and visual clarity of the embeddings. We conduct qualitative and quantitative experiments on image and text datasets, comparing our method with baseline and state-of-the-art methods.
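To make the construction concrete, here is a minimal sketch combining the three ingredients named above: a GP-LVM decoder objective, a spectral mixture kernel, and a penalty for local structure preservation. The abstract does not specify the exact form of the locally estimated manifold term, so the kNN-graph Laplacian penalty below is an illustrative stand-in, and all function names and hyperparameters (`sm_kernel`, `knn_laplacian`, `lam`, `noise`, and so on) are assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.distance import cdist


def sm_kernel(X, weights, means, variances):
    """Spectral mixture kernel on latent points X (N, q):
    k(tau) = sum_q w_q * prod_d exp(-2 pi^2 tau_d^2 v_qd) * cos(2 pi tau_d mu_qd)."""
    tau = X[:, None, :] - X[None, :, :]                # pairwise differences (N, N, q)
    K = np.zeros((X.shape[0], X.shape[0]))
    for w, mu, v in zip(weights, means, variances):    # one Gaussian per mixture component
        decay = np.exp(-2.0 * np.pi**2 * tau**2 * v).prod(axis=-1)
        period = np.cos(2.0 * np.pi * tau * mu).prod(axis=-1)
        K += w * decay * period
    return K


def knn_laplacian(Y, k=10):
    """Unnormalized graph Laplacian of a kNN graph built in *data* space,
    used here as a stand-in for the locally estimated manifold."""
    D = cdist(Y, Y)
    W = np.zeros_like(D)
    nbrs = np.argsort(D, axis=1)[:, 1:k + 1]           # k nearest neighbors, excluding self
    for i, js in enumerate(nbrs):
        W[i, js] = 1.0
    W = np.maximum(W, W.T)                             # symmetrize the graph
    return np.diag(W.sum(axis=1)) - W


def objective(x_flat, Y, L, params, q=2, lam=1.0, noise=1e-2):
    """Negative GP-LVM log marginal likelihood (constants dropped)
    plus a Laplacian penalty encouraging local structure preservation."""
    N, D = Y.shape
    X = x_flat.reshape(N, q)
    K = sm_kernel(X, *params) + noise * np.eye(N)      # jitter keeps K positive definite
    C = np.linalg.cholesky(K)
    alpha = np.linalg.solve(C.T, np.linalg.solve(C, Y))  # K^{-1} Y
    nll = D * np.log(np.diag(C)).sum() + 0.5 * np.sum(Y * alpha)
    reg = lam * np.trace(X.T @ L @ X)                  # small when data-space neighbors stay close
    return nll + reg


rng = np.random.default_rng(0)
Y = rng.standard_normal((60, 20))                      # toy high-dimensional data
L = knn_laplacian(Y, k=5)
params = (np.array([1.0, 0.5]),                        # mixture weights     (Q=2)
          rng.standard_normal((2, 2)) * 0.1,           # frequency means     (Q, q)
          np.full((2, 2), 0.5))                        # frequency variances (Q, q)
x0 = 0.1 * rng.standard_normal(60 * 2)
# Gradients are finite-differenced for brevity, so keep the iteration budget small.
res = minimize(objective, x0, args=(Y, L, params), method="L-BFGS-B",
               options={"maxiter": 25})
X_embed = res.x.reshape(60, 2)                         # 2-D coordinates for plotting
```

In practice one would also optimize the kernel hyperparameters alongside the latent coordinates and supply analytic gradients (or use an automatic differentiation framework); the fixed parameters and finite-difference optimization here are for illustration only.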
