Paper information - Barcodes: The persistent topology of data

Barcodes: The persistent topology of data

This article surveys recent work of Carlsson and collaborators on applications of computational algebraic topology to problems of feature detection and shape recognition in high-dimensional data. The primary mathematical tool considered is a homology theory for point-cloud data sets (persistent homology) and a novel representation of this algebraic characterization (barcodes). We sketch an application of these techniques to the classification of natural images.

1. The shape of data

When a topologist is asked, “How do you visualize a four-dimensional object?” the appropriate response is a Socratic rejoinder: “How do you visualize a three-dimensional object?” We do not see in three spatial dimensions directly, but rather via sequences of planar projections integrated in a manner that is sensed if not comprehended. We spend a significant portion of our first year of life learning how to infer three-dimensional spatial data from paired planar projections. Years of practice have tuned a remarkable ability to extract global structure from representations in a strictly lower dimension.

The inference of global structure occurs on much finer scales as well, with regard to converting discrete data into continuous images. Dot-matrix printers, scrolling LED tickers, televisions, and computer displays all communicate images via arrays of discrete points which are integrated into coherent, global objects. This also is a skill we have practiced from childhood. No adult does a dot-to-dot puzzle with anything approaching anticipation.

1.1. Topological data analysis.

Problems of data analysis share many features with these two fundamental integration tasks: (1) how does one infer high-dimensional structure from low-dimensional representations; and (2) how does one assemble discrete points into global structure.
The principal themes of this survey of the work of Carlsson, de Silva, Edelsbrunner, Harer, Zomorodian, and others are the following:

(1) It is beneficial to replace a set of data points with a family of simplicial complexes, indexed by a proximity parameter. This converts the data set into global topological objects.

(2) It is beneficial to view these topological complexes through the lens of algebraic topology, specifically via a novel theory of persistent homology adapted to parameterized families.

(3) It is beneficial to encode the persistent homology of a data set in the form of a parameterized version of a Betti number: a barcode.

The author gratefully acknowledges the support of DARPA # HR0011-07-1-0002. The work reviewed in this article is funded by the DARPA program TDA: Topological Data Analysis.
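As a rough illustration of themes (1) through (3) above, the following minimal Python sketch computes a dimension-0 barcode for a small point cloud. It is not the algorithm of the surveyed work. It assumes a Rips-style family in which an edge joins two points once the proximity parameter reaches their Euclidean distance, and it tracks connected components only, using a union-find structure. The function names `h0_barcode` and `betti0` and the sample points are hypothetical, chosen for illustration.

```python
import math
from itertools import combinations


def h0_barcode(points):
    """Return dimension-0 bars (birth, death) for a point cloud.

    Assumption: an edge between two points enters the complex when the
    proximity parameter equals their Euclidean distance.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per component

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All candidate edges, ordered by the parameter value at which they appear.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    bars = []
    for eps, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge at eps, so one bar born at 0 ends here.
            parent[ri] = rj
            bars.append((0.0, eps))

    # One component survives for every parameter value.
    bars.append((0.0, math.inf))
    return bars


def betti0(bars, eps):
    """Count the bars alive at parameter eps (half-open intervals)."""
    return sum(1 for birth, death in bars if birth <= eps < death)


if __name__ == "__main__":
    # Two well-separated clusters (hypothetical sample data).
    cloud = [(0, 0), (1, 0), (0, 1), (10, 0), (11, 0)]
    bars = h0_barcode(cloud)
    for bar in bars:
        print(bar)
    print("components at eps = 2:", betti0(bars, 2.0))
```

In this sample the short bars end when points within a cluster join, while one long finite bar persists until the two clusters merge; reading the barcode at a fixed parameter value recovers the number of connected components at that scale. Higher-dimensional bars (loops, voids) require the full simplicial complex and a boundary-matrix reduction, which this sketch omits.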

R. Ghrist
