Multiscale Persistent Functions for Biomolecular Structure Characterization

In this paper, we introduce multiscale persistent functions for biomolecular structure characterization. The essential idea is to combine our multiscale rigidity functions (MRFs) with persistent homology analysis to construct a series of multiscale persistent functions, in particular multiscale persistent entropies, for structure characterization. To clarify the fundamental idea of our method, the multiscale persistent entropy (MPE) model is discussed in detail. Mathematically, unlike the previous persistent entropy (Chintakunta et al. in Pattern Recognit 48(2):391–401, 2015; Merelli et al. in Entropy 17(10):6872–6892, 2015; Rucco et al. in: Proceedings of ECCS 2014, Springer, pp 117–128, 2016), our model incorporates a resolution parameter, and different scales are obtained by tuning its value. Physically, our MPE can be used to evaluate conformational entropy. More specifically, we find that our method has a natural classification scheme built in, which is achieved through a density filtration of an MRF built from angular distributions. To further validate our model, we carry out a systematic comparison with the traditional entropy evaluation model. We find that our model preserves the intrinsic topological features of biomolecular data much better than traditional approaches, particularly at intermediate resolutions. Moreover, in a comparison with traditional entropies computed on various grid sizes, bond angle-based methods, and a persistent homology-based support vector machine method (Cang et al. in Mol Based Math Biol 3:140–162, 2015), our MPE method gives the best average true positive rate in a classic protein structure classification test. More interestingly, only our model separates the all-alpha and all-beta protein classes from each other with zero error.
Finally, a protein structure index (PSI) is proposed, for the first time, to describe the “regularity” of protein structures. A protein structure is deemed regular if it has a consistent and orderly configuration. Our PSI model is tested on a database of 110 proteins; we find that structures with larger portions of loops and intrinsically disordered regions are always associated with a larger PSI, indicating an irregular configuration, while proteins with larger portions of secondary structure, i.e., alpha-helix or beta-sheet, have a smaller PSI. Essentially, PSI can be used to describe “regularity” information in any system.
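For orientation, the sketch below computes the persistent entropy in the form used by the earlier work cited above: the Shannon entropy of the bar lengths of a barcode, each length normalized by the total length. It is a minimal illustration only, not the MPE model itself. MPE additionally depends on a resolution parameter and on a density filtration of a rigidity function, neither of which is specified in this abstract. The function name `persistent_entropy` and the two example barcodes are hypothetical.

```python
import math


def persistent_entropy(bars):
    """Shannon entropy of the normalized bar lengths of a barcode.

    bars: iterable of (birth, death) pairs with finite values.
    Bars of zero or negative length are ignored.
    """
    lengths = [death - birth for birth, death in bars if death > birth]
    total = sum(lengths)
    if total == 0:
        # Empty barcode: define the entropy as zero.
        return 0.0
    probs = [length / total for length in lengths]
    return -sum(p * math.log(p) for p in probs)


# Hypothetical barcodes, chosen only to show the two extremes.
uniform = [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0), (0.0, 1.0)]   # equal bars
skewed = [(0.0, 10.0), (0.0, 0.1), (0.0, 0.1), (0.0, 0.1)]   # one dominant bar

e_uniform = persistent_entropy(uniform)
e_skewed = persistent_entropy(skewed)
```

With equal bar lengths the entropy reaches its maximum, the logarithm of the number of bars; a barcode dominated by one long bar gives a value closer to zero.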

[1]  R. Bowen  Topological entropy for noncompact sets , 1973 .

[2]  M. Levitt,et al.  Computer simulation of protein folding , 1975, Nature.

[3]  Arnold T. Hagler,et al.  Computer simulation of the conformational properties of oligopeptides. Comparison of theoretical methods and analysis of experimental results , 1979 .

[4]  M. Karplus,et al.  Method for estimating the configurational entropy of macromolecules , 1981 .

[5]  James R. Munkres,et al.  Elements of algebraic topology , 1984 .

[6]  Herbert Edelsbrunner,et al.  Three-dimensional alpha shapes , 1992, VVS.

[7]  T. Pollard,et al.  Annual review of biophysics and biomolecular structure , 1992 .

[8]  M J Sternberg,et al.  Side‐chain conformational entropy in protein folding , 1995, Protein science : a publication of the Protein Society.

[9]  W. Stites,et al.  Empirical evaluation of the influence of side chains on the conformational entropy of the polypeptide backbone , 1995, Proteins.

[10]  Samuel H. Gellman,et al.  Introduction: Molecular Recognition. , 1997, Chemical reviews.

[11]  K. Sharp,et al.  Entropy in protein folding and in protein-protein interactions. , 1997, Current opinion in structural biology.

[12]  Patrizio Frosini,et al.  Size theory as a topological tool for computer vision , 1999 .

[13]  Joshua D. Reiss,et al.  Construction of symbolic dynamics from experimental time series , 1999 .

[14]  H. Hansma,et al.  The backbone conformational entropy of protein folding: experimental measures from atomic force microscopy. , 2002, Journal of molecular biology.

[15]  B. Halle,et al.  Flexibility and packing in proteins , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[16]  Herbert Edelsbrunner,et al.  Topological Persistence and Simplification , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.

[17]  Natasja Brooijmans,et al.  Molecular recognition and docking algorithms. , 2003, Annual review of biophysics and biomolecular structure.

[18]  J. Fitter A measure of conformational entropy change during thermal protein unfolding using neutron spectroscopy. , 2003, Biophysical journal.

[19]  Afra Zomorodian,et al.  Computing Persistent Homology , 2004, SCG '04.

[20]  坂上 貴之  Book review: Computational Homology , 2005 .

[21]  Michael K Gilson,et al.  Evaluating the Accuracy of the Quasiharmonic Approximation. , 2005, Journal of chemical theory and computation.

[22]  Abubakr Muhammad,et al.  Blind Swarms for Coverage in 2-D , 2005, Robotics: Science and Systems.

[23]  A. Sali,et al.  Statistical potential for assessment and prediction of protein structures , 2006, Protein science : a publication of the Protein Society.

[24]  Jeremy M Moix,et al.  Dihedral-angle information entropy as a gauge of secondary structure propensity. , 2006, Biophysical journal.

[25]  David Cohen-Steiner,et al.  Vines and vineyards by updating persistence in linear time , 2006, SCG '06.

[26]  Peter Bubenik,et al.  A statistical approach to persistent homology , 2006, math/0607634.

[27]  Vin de Silva,et al.  On the Local Behavior of Spaces of Natural Images , 2007, International Journal of Computer Vision.

[28]  Leonidas J. Guibas,et al.  Persistent voids: a new structural metric for membrane fusion , 2007, Bioinformatics (doi:10.1093/bioinformatics/btm250).

[29]  Afra Zomorodian,et al.  Localized Homology , 2007, IEEE International Conference on Shape Modeling and Applications 2007 (SMI '07).

[30]  A. Wand,et al.  Conformational entropy in molecular recognition by proteins , 2007, Nature.

[31]  R. Ghrist Barcodes: The persistent topology of data , 2007 .

[32]  Afra Zomorodian,et al.  The Theory of Multidimensional Persistence , 2007, SCG '07.

[33]  D. Ringach,et al.  Topological analysis of population activity in visual cortex. , 2008, Journal of vision.

[34]  Daniela Giorgi,et al.  Describing shapes by geometrical-topological properties of real functions , 2008, CSUR.

[35]  David Cohen-Steiner,et al.  Computing geometry-aware handle and tunnel loops in 3D models , 2008, ACM Trans. Graph..

[36]  Jie Liang,et al.  Discrete state model and accurate estimation of loop entropy of RNA secondary structures. , 2008, The Journal of chemical physics.

[37]  Afra Zomorodian,et al.  Computing Multidimensional Persistence , 2009, J. Comput. Geom..

[38]  Danijela Horak,et al.  Persistent homology of complex networks , 2008, 0811.2203.

[39]  L. Guibas,et al.  Topological methods for exploring low-density states in biomolecular folding pathways. , 2008, The Journal of chemical physics.

[40]  Robert Abel,et al.  Protein side-chain dynamics and residual conformational entropy. , 2009, Journal of the American Chemical Society.

[41]  Gunnar E. Carlsson,et al.  Topology and data , 2009 .

[42]  J. Andrew McCammon,et al.  Absolute Single-Molecule Entropies from Quasi-Harmonic Analysis of Microsecond Molecular Dynamics: Correction Terms and Convergence Properties , 2009, Journal of chemical theory and computation.

[43]  Herbert Edelsbrunner,et al.  Computational Topology - an Introduction , 2009 .

[44]  Herbert Edelsbrunner,et al.  Computing Robustness and Persistence for Images , 2010, IEEE Transactions on Visualization and Computer Graphics.

[45]  A. Joshua Wand,et al.  The role of conformational entropy in molecular recognition by calmodulin , 2010, Nature chemical biology.

[46]  Andrew L. Lee,et al.  Using NMR to study fast dynamics in proteins: methods and applications. , 2010, Current opinion in pharmacology.

[47]  Claudia Landi,et al.  A Mayer–Vietoris Formula for Persistent Homology with an Application to Shape Recognition in the Presence of Occlusions , 2011, Found. Comput. Math..

[48]  Moo K. Chung,et al.  Topology-Based Kernels With Application to Inference Problems in Alzheimer's Disease , 2011, IEEE Transactions on Medical Imaging.

[49]  Stephen Smale,et al.  A Topological View of Unsupervised Learning from Noisy Data , 2011, SIAM J. Comput..

[50]  Valerio Pascucci,et al.  Branching and Circular Features in High Dimensional Data , 2011, IEEE Transactions on Visualization and Computer Graphics.

[51]  Tamal K. Dey,et al.  Reeb Graphs: Approximation and Persistence , 2011, SoCG '11.

[52]  M. Gameiro,et al.  Topological Measurement of Protein Compressibility via Persistence Diagrams , 2012 .

[53]  Hubert Mara,et al.  Multivariate Data Analysis Using Persistence-Based Filtering and Topological Signatures , 2012, IEEE Transactions on Visualization and Computer Graphics.

[54]  X. Liu,et al.  A fast algorithm for constructing topological structure in large data , 2012 .

[55]  Bung-Nyun Kim,et al.  Persistent Brain Network Homology From the Perspective of Dendrogram , 2012, IEEE Transactions on Medical Imaging.

[56]  Steve Oudot,et al.  Persistence stability for geometric complexes , 2012, ArXiv.

[57]  Anil Korkut,et al.  Stereochemistry of polypeptide conformation in coarse grained analysis , 2013, 1302.1944.

[58]  Kelin Xia,et al.  Multiscale multiphysics and multidomain models--flexibility and rigidity. , 2013, The Journal of chemical physics.

[59]  M. Ferri,et al.  Betti numbers in multidimensional persistent homology are stable functions , 2013 .

[60]  Andrea Cerri,et al.  The Persistence Space in Multidimensional Persistent Homology , 2013, DGCI.

[61]  Patrizio Frosini,et al.  Persistent Betti numbers for a noise tolerant shape-based approach to image retrieval , 2011, Pattern Recognit. Lett..

[62]  Amit Das,et al.  Conformational contribution to thermodynamics of binding in protein-peptide complexes through microscopic simulation. , 2013, Biophysical journal.

[63]  Konstantin Mischaikow,et al.  Morse Theory for Filtrations and Efficient Computation of Persistent Homology , 2013, Discret. Comput. Geom..

[64]  D. Castle Cannabis and psychosis: what causes what? , 2013, F1000 medicine reports.

[65]  Joël Janin,et al.  Protein flexibility, not disorder, is intrinsic to molecular recognition , 2013, F1000 biology reports.

[66]  Emanuela Merelli,et al.  jHoles: A Tool for Understanding Biological Complex Networks via Clique Weight Rank Persistent Homology , 2014, CS2Bio.

[67]  Ulrich Bauer,et al.  Distributed Computation of Persistent Homology , 2014, ALENEX.

[68]  Gunnar E. Carlsson,et al.  Topological pattern recognition for point cloud data , 2014, Acta Numerica.

[69]  Kelin Xia,et al.  Persistent homology analysis of protein structure, flexibility, and folding , 2014, International journal for numerical methods in biomedical engineering.

[70]  Kelin Xia,et al.  Fast and anisotropic flexibility-rigidity index for protein flexibility and fluctuation analysis. , 2014, The Journal of chemical physics.

[71]  Emanuela Merelli,et al.  Characterisation of the Idiotypic Immune Network Through Persistent Entropy , 2014, ECCS.

[72]  Guo-Wei Wei,et al.  Molecular nonlinear dynamics and protein thermal uncertainty quantification. , 2014, Chaos.

[73]  Kelin Xia,et al.  Communication: Capturing protein multiscale thermal fluctuations. , 2015, The Journal of chemical physics.

[74]  Peter Bubenik,et al.  Statistical topological data analysis using persistence landscapes , 2012, J. Mach. Learn. Res..

[75]  M. Gameiro,et al.  A topological measurement of protein compressibility , 2014, Japan Journal of Industrial and Applied Mathematics.

[76]  Guo-Wei Wei,et al.  Multiresolution Topological Simplification , 2015, J. Comput. Biol..

[77]  Kelin Xia,et al.  Persistent topology for cryo‐EM data analysis , 2014, International journal for numerical methods in biomedical engineering.

[78]  Rocío González-Díaz,et al.  An entropy-based persistence barcode , 2015, Pattern Recognit..

[79]  Yiying Tong,et al.  Persistent homology for the quantitative prediction of fullerene stability , 2014, J. Comput. Chem..

[80]  Guo-Wei Wei,et al.  A topological approach for protein classification , 2015, 1510.00953.

[81]  Sunita Yadav,et al.  Thiamine Pyrophosphate Riboswitch in Some Representative Plant Species: A Bioinformatics Study , 2015, J. Comput. Biol..

[82]  P. Biswas,et al.  Conformational Entropy of Intrinsically Disordered Proteins from Amino Acid Triads , 2015, Scientific Reports.

[83]  Peter M. A. Sloot,et al.  Topological Characterization of Complex Systems: Using Persistent Entropy , 2015, Entropy.

[84]  Guo-Wei Wei,et al.  Multiscale Gaussian network model (mGNM) and multiscale anisotropic network model (mANM). , 2015, The Journal of chemical physics.

[85]  Guo-Wei Wei,et al.  Multidimensional persistence in biomolecular data , 2014, J. Comput. Chem..

[86]  Guo-Wei Wei,et al.  Flexibility–rigidity index for protein–nucleic acid flexibility and fluctuation analysis , 2015, J. Comput. Chem..

[87]  Guo-Wei Wei,et al.  Object-oriented persistent homology , 2016, J. Comput. Phys..

[88]  Kelin Xia,et al.  Generalized flexibility-rigidity index. , 2016, The Journal of chemical physics.

[89]  Rocío González-Díaz,et al.  A new topological entropy-based approach for measuring similarities among piecewise linear functions , 2015, Signal Process..

[90]  U. Feige,et al.  Spectral Graph Theory , 2015 .

[91]  R. Ho Algebraic Topology , 2022 .