MonteGrappa: An iterative Monte Carlo program to optimize biomolecular potentials in simplified models

Abstract

Simplified models, including implicit-solvent and coarse-grained models, are useful tools to investigate the physical properties of biological macromolecules of large size, like protein complexes, large DNA/RNA strands and chromatin fibres. While advanced Monte Carlo techniques are quite efficient in sampling the conformational space of such models, the availability of realistic potentials is still a limitation to their general applicability. The recent development of a computational scheme capable of designing potentials to reproduce any kind of experimental data that can be expressed as thermal averages of conformational properties of the system has partially alleviated the problem. Here we present a program that implements the optimization of the potential with respect to the experimental data through an iterative Monte Carlo algorithm and a rescaling of the probability of the sampled conformations. The Monte Carlo sampling includes several types of moves, suitable for different kinds of system, and various sampling schemes, such as fixed-temperature, replica-exchange and adaptive simulated tempering. The conformational properties whose thermal averages are used as inputs currently include contact functions, distances and functions of distances, but can be easily extended to any function of the coordinates of the system.

Program summary

Program title: MonteGrappa
Catalogue identifier: AEUO_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEUO_v1_0.html
Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland
Licensing provisions: GNU General Public License, version 3
No. of lines in distributed program, including test data, etc.: 139,987
No. of bytes in distributed program, including test data, etc.: 1,889,541
Distribution format: tar.gz
Programming language: C.
Computer: Any computer with C compilers.
Operating system: Linux, Unix, OSX.
RAM: Bytes depend on the size of the system, typically 4 GB
Classification: 3, 16.1.
External routines: gsl, MPI (optional)
Nature of problem: Optimize an interaction potential for coarse-grained models of biopolymers based on experimental data expressed as averages of conformational properties.
Solution method: Iterative Monte Carlo sampling coupled with minimization of the χ² between experimental and back-calculated data making use of a reweighting algorithm.
Running time: Hours to days, depending on the complexity of the problem.
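
The solution method can be illustrated with a minimal Python sketch of a single reweighting-and-minimization step. It is not taken from the MonteGrappa source (which is written in C). All function and variable names are hypothetical, the potential is assumed to be a linear sum of contact functions weighted by energies `eps`, and a simple random search stands in for whichever minimizer the program actually uses. Conformations sampled with the current potential are reweighted by the Boltzmann factor of the energy change, so the χ² of a trial potential can be estimated without a new simulation.

```python
import math
import random


def reweighted_averages(delta_u, observables, temperature):
    """Average each observable with weights proportional to exp(-delta_u / T)."""
    shift = min(delta_u)  # subtract the minimum so the exponentials cannot overflow
    weights = [math.exp(-(du - shift) / temperature) for du in delta_u]
    norm = sum(weights)
    n_obs = len(observables[0])
    return [sum(w * obs[i] for w, obs in zip(weights, observables)) / norm
            for i in range(n_obs)]


def chi2(averages, targets, sigmas):
    """Sum of squared, error-normalized deviations from the experimental data."""
    return sum(((a - t) / s) ** 2 for a, t, s in zip(averages, targets, sigmas))


def optimize_potential(contacts, observables, targets, sigmas, eps0,
                       temperature=1.0, n_steps=2000, step=0.1, seed=0):
    """Random search over contact energies, scored on the reweighted sample.

    Assumption: U(x) = sum_k eps[k] * contacts[x][k]; the sample was drawn with eps0.
    """
    rng = random.Random(seed)

    def score(eps):
        # Energy change of every sampled conformation under the trial potential.
        delta_u = [sum((e - e0) * c for e, e0, c in zip(eps, eps0, conf))
                   for conf in contacts]
        averages = reweighted_averages(delta_u, observables, temperature)
        return chi2(averages, targets, sigmas)

    best, best_chi2 = list(eps0), score(eps0)
    for _ in range(n_steps):
        trial = list(best)
        k = rng.randrange(len(trial))
        trial[k] += rng.uniform(-step, step)  # perturb one contact energy
        trial_chi2 = score(trial)
        if trial_chi2 < best_chi2:  # keep only moves that lower chi2
            best, best_chi2 = trial, trial_chi2
    return best, best_chi2


# Toy sample: one contact, formed in half of the conformations sampled at eps0 = 0.
contacts = [[1.0]] * 50 + [[0.0]] * 50
eps, final_chi2 = optimize_potential(contacts, contacts, targets=[0.8],
                                     sigmas=[0.1], eps0=[0.0])
print(eps, final_chi2)
```

In the iterative scheme described above, the optimized energies would then be used for a new round of Monte Carlo sampling, and the step repeated until the back-calculated averages agree with the experimental data.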
