PathMolD-AB: Spatiotemporal pathways of protein folding using parallel molecular dynamics with a coarse-grained model

Solving the protein folding problem (PFP) remains one of the open grand challenges in computational biophysics. Globular proteins are believed to evolve from their initial configurations along folding pathways that connect several thermodynamically accessible states in a free energy landscape, until they reach its minimum, which is occupied by the stable native structures. Despite its heavy computational burden, molecular dynamics (MD) is the leading approach in PFP studies because it preserves the Newtonian temporal evolution in the canonical ensemble. Non-trivial improvements come from highly parallel implementations of MD on cost-effective GPUs, combined with multiscale descriptions of proteins by coarse-grained minimalist models. In this vein, we present the PathMolD-AB framework, a comprehensive software package for massively parallel MD simulations in the canonical ensemble, structural analysis, and visualization of folding pathways using the minimalist AB-model. It also includes a tool to compare the results with proteins re-scaled from the PDB. As case studies, we simulate and analyze the folding of four proteins: 13FIBO, 2GB1, 1PLC and 5ANZ, with 13, 55, 99 and 223 amino acids, respectively. The datasets generated from the simulations correspond to the MD evolution of 3500 folding pathways, encompassing 35×10^6 states, each of which contains the spatial amino acid positions, the protein free energy and the radius of gyration at that time step. Results indicate that the speedup of our approach grows logarithmically with the protein length and, therefore, that it is suited to most of the proteins in the PDB. The structures predicted by PathMolD-AB were similar to the re-scaled biological structures, indicating that the framework is promising for the study of the PFP.
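As an illustration of one quantity stored for each state, the radius of gyration of a coarse-grained chain can be computed from the bead positions alone. The following is a minimal sketch, not the PathMolD-AB implementation: the function name `radius_of_gyration`, the equal-mass assumption for all beads, and the tuple-based input format are hypothetical choices made for this example.

```python
import math


def radius_of_gyration(positions):
    """Radius of gyration of a bead chain.

    Assumption (not from the paper): every bead has the same mass, so the
    center of mass is the plain average of the positions.
    `positions` is a sequence of (x, y, z) tuples, one per amino acid bead.
    """
    n = len(positions)
    if n == 0:
        raise ValueError("positions must contain at least one bead")

    # Center of mass under the equal-mass assumption.
    centroid = [sum(p[k] for p in positions) / n for k in range(3)]

    # Mean squared distance of the beads from the center of mass.
    mean_sq = sum(
        sum((p[k] - centroid[k]) ** 2 for k in range(3)) for p in positions
    ) / n

    return math.sqrt(mean_sq)


if __name__ == "__main__":
    # Hypothetical fully stretched 3-bead chain with unit bond length.
    stretched = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    print(radius_of_gyration(stretched))
```

A compact, folded conformation gives a smaller value than a stretched one of the same length, which is why this quantity is commonly tracked along a folding pathway.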
