Multiresolution persistent homology for excessively large biomolecular datasets.

Although persistent homology has emerged as a promising tool for the topological simplification of complex data, it is computationally intractable for large datasets. We introduce multiresolution persistent homology to handle excessively large datasets. We match the resolution to the scale of interest so that a large dataset is represented at an appropriate resolution. We use the flexibility-rigidity index to assess the topological connectivity of the dataset and define a rigidity density for the filtration analysis. By appropriately tuning the resolution of the rigidity density, we are able to focus the topological lens on the scale of interest. The proposed multiresolution topological analysis is validated on a hexagonal fractal image that has three distinct scales. We further demonstrate the proposed method by extracting topological fingerprints from DNA molecules. In particular, the topological persistence of a virus capsid with 273 780 atoms is successfully analyzed; this structure would otherwise be inaccessible to the normal point cloud method, and its analysis would be unreliable with coarse-grained multiscale persistent homology. The proposed method has also been successfully applied to protein domain classification, which is, to our knowledge, the first time that persistent homology has been used for practical protein domain analysis. The proposed multiresolution topological method has potential applications to arbitrary datasets, such as social networks, biological networks, and graphs.
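
The idea of tuning the resolution of a rigidity density can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the Gaussian kernel, the toy atom positions, the function names `rigidity_density` and `count_components`, and the half-maximum threshold are all assumptions made for illustration. Each atom contributes a kernel of width `eta` (the resolution parameter), and the connected pieces of a superlevel set of the summed density are counted as a crude stand-in for zero-dimensional topology.

```python
import numpy as np

def rigidity_density(grid, atoms, eta):
    """Sum of Gaussian kernels of width eta centred on each atom (assumed kernel)."""
    diff = grid[:, None] - atoms[None, :]
    return np.exp(-(diff / eta) ** 2).sum(axis=1)

def count_components(density, level=0.5):
    """Count connected intervals where density >= level * max(density)."""
    mask = density >= level * density.max()
    # A component starts at each False -> True transition, or at index 0 if it is True.
    rising = np.count_nonzero(mask[1:] & ~mask[:-1])
    return int(rising) + int(mask[0])

# Hypothetical toy data: two clusters of three "atoms" each on a line.
atoms = np.array([0.0, 1.0, 2.0, 12.0, 13.0, 14.0])
grid = np.linspace(-5.0, 19.0, 2401)

for eta in (0.3, 1.0, 8.0):
    mu = rigidity_density(grid, atoms, eta)
    print(f"eta = {eta}: {count_components(mu)} component(s)")
```

With these toy settings the component count drops from six (individual atoms) to two (clusters) to one (the whole structure) as `eta` grows, which mirrors how a single resolution parameter selects the scale at which topology is observed. A real analysis would evaluate the density on a three-dimensional grid and compute full persistent homology over a filtration rather than a single threshold.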
