Detecting intermediate protein conformations using algebraic topology

Background: Understanding the structure and dynamics of proteins is essential for understanding their function. This is a challenging task because of the high complexity of protein conformational landscapes and their rugged energy surfaces. In particular, it is important to detect highly populated regions, which could correspond to intermediate structures or local minima.

Results: We present a method based on hierarchical clustering and algebraic topology that detects regions of interest in protein conformational space. Its main tools are coarse-grained protein conformational search, efficient and robust dimensionality reduction, and topological analysis via persistent homology. For dimensionality reduction we use two methods, robust Principal Component Analysis (PCA) and Isomap, to generate a reduced representation of the data while preserving most of its variance.

Conclusions: Our hierarchical clustering method produced compact, well-separated clusters for all the tested examples.
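To illustrate how hierarchical clustering and persistent homology relate, the following is a minimal sketch and not the authors' implementation. It assumes conformations have already been reduced to low-dimensional points, and computes the 0-dimensional persistence of a Vietoris-Rips filtration with a union-find structure, which is equivalent to single-linkage clustering. The function names `zero_dim_persistence` and `single_linkage_labels` and the toy points are hypothetical, chosen only for illustration.

```python
from itertools import combinations
from math import dist


def _sorted_edges(points):
    """All pairwise edges (distance, i, j), sorted by increasing distance."""
    n = len(points)
    return sorted(
        (dist(points[i], points[j]), i, j) for i, j in combinations(range(n), 2)
    )


def _find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def zero_dim_persistence(points):
    """Scales at which connected components merge (all components are born at 0).

    Returns n - 1 death values in increasing order; a large gap between
    consecutive values suggests well-separated clusters.
    """
    parent = list(range(len(points)))
    deaths = []
    for d, i, j in _sorted_edges(points):
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:  # this edge joins two components, so one of them dies
            parent[ri] = rj
            deaths.append(d)
    return deaths


def single_linkage_labels(points, threshold):
    """Cluster labels obtained by joining points closer than `threshold`."""
    parent = list(range(len(points)))
    for d, i, j in _sorted_edges(points):
        if d > threshold:
            break
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:
            parent[ri] = rj
    return [_find(parent, i) for i in range(len(points))]


if __name__ == "__main__":
    # Toy data (assumption): two well-separated groups in a 2D reduced space.
    toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(zero_dim_persistence(toy))
    print(single_linkage_labels(toy, threshold=2.0))
```

In this sketch, a threshold placed inside the largest gap between consecutive merge scales separates the long-lived components, which play the role of the highly populated regions described in the abstract. Higher-dimensional homology (loops, voids) would require a dedicated persistent homology library.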
