Why Topology for Machine Learning and Knowledge Extraction?

Data has shape, and shape is the domain of geometry and, in particular, of its “free” part, called topology. The aim of this paper is twofold. First, it gives a brief overview of applications of topology to machine learning and knowledge extraction, together with the motivations behind them. Second, it aims to promote cross-talk between the theoretical and applied domains of topology and machine learning research. Such interactions can benefit both the development of novel theoretical tools and the discovery of cutting-edge practical applications.
