Persistent Homology-based Projection Pursuit

The dimensionality reduction problem is stated as finding a mapping $f: X \subset \mathbb{R}^m \to Z \subset \mathbb{R}^n$, where $n \ll m$, that preserves some relevant properties of the data. We formulate topology-preserving dimensionality reduction as finding the optimal orthogonal projection onto a lower-dimensional subspace, where optimality means minimizing the discrepancy between the persistence diagrams of the original data and of its projection. This generalizes the classic projection pursuit algorithm, which was originally designed to preserve the number of clusters, i.e. the zeroth-order topological invariant of the data. Our approach additionally allows preserving $k$-th order invariants within a principled framework. We further pose the resulting problem as a Riemannian optimization problem, which admits a natural and efficient solution.
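As a rough illustration of the formulation above, the following minimal sketch searches for a matrix $P \in \mathbb{R}^{m \times n}$ with orthonormal columns that reduces a discrepancy between the zeroth-order persistence of the data $X$ and of its projection $XP$. It is not the authors' implementation, and it rests on several simplifying assumptions: only $H_0$ is considered, with death times read off the minimum spanning tree edge lengths; the diagram discrepancy is replaced by the sum of squared differences between sorted death times; the optimizer is plain Riemannian gradient descent on the Stiefel manifold with a QR retraction and a fixed step size. All function names (`mst_edges`, `h0_deaths`, `h0_loss_and_grad`, `stiefel_step`, `fit_projection`) are hypothetical.

```python
import numpy as np

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.
    Returns (i, j, length) tuples sorted by length."""
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()            # cheapest link from each vertex to the tree
    parent = np.zeros(n, dtype=int)  # tree endpoint of that cheapest link
    edges = []
    for _ in range(n - 1):
        j = int(np.argmin(np.where(in_tree, np.inf, best)))
        edges.append((int(parent[j]), j, float(best[j])))
        in_tree[j] = True
        update = (dist[j] < best) & ~in_tree
        best[update] = dist[j][update]
        parent[update] = j
    edges.sort(key=lambda e: e[2])
    return edges

def h0_deaths(points):
    """Sorted H0 death times (assumed here to equal the MST edge lengths)."""
    return np.array([length for _, _, length in mst_edges(points)])

def h0_loss_and_grad(X, P, target_deaths):
    """Squared difference of sorted H0 death times and its Euclidean gradient in P."""
    edges = mst_edges(X @ P)
    loss = 0.0
    grad = np.zeros_like(P)
    for (i, j, d), t in zip(edges, target_deaths):
        diff = X[i] - X[j]
        loss += (d - t) ** 2
        if d > 0:  # d = ||diff @ P||, so its gradient is outer(diff, diff @ P) / d
            grad += 2.0 * (d - t) * np.outer(diff, diff @ P) / d
    return loss, grad

def stiefel_step(P, egrad, lr):
    """One descent step: project onto the tangent space, then retract with QR."""
    sym = (P.T @ egrad + egrad.T @ P) / 2.0
    rgrad = egrad - P @ sym
    Q, R = np.linalg.qr(P - lr * rgrad)
    signs = np.where(np.diag(R) < 0, -1.0, 1.0)  # make the QR factor unique
    return Q * signs

def fit_projection(X, n_components, steps=50, lr=0.01, seed=0):
    """Gradient descent over orthonormal m x n matrices; returns P and the loss history."""
    rng = np.random.default_rng(seed)
    P, _ = np.linalg.qr(rng.normal(size=(X.shape[1], n_components)))
    target = h0_deaths(X)
    history = []
    for _ in range(steps):
        loss, egrad = h0_loss_and_grad(X, P, target)
        history.append(float(loss))
        P = stiefel_step(P, egrad, lr)
    return P, history

# Toy usage: two Gaussian clusters in R^5 projected to R^2.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(20, 5)),
               rng.normal(3.0, 0.3, size=(20, 5))])
P, history = fit_projection(X, n_components=2, steps=30, lr=0.01)
print("initial loss:", history[0], "final loss:", history[-1])
```

With a fixed step size the loss is not guaranteed to decrease monotonically. A fuller treatment would presumably use a proper distance between persistence diagrams (for example a Wasserstein-type distance), include higher-order homology computed with a persistent homology library, and rely on a Riemannian optimization toolbox instead of the hand-written step.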