Universal and efficient entropy estimation using a compression algorithm

Entropy and free-energy estimation are key to the thermodynamic characterization of simulated systems, ranging from spin models to polymer physics, protein structure, and drug design. Current techniques are model specific, require abundant computational resources, and rely on simulation at conditions far from the studied realization. Here, we present a universal scheme for calculating entropy using lossless compression algorithms and validate it on simulated systems of increasing complexity. The scheme yields entropy values that agree with benchmark calculations while remaining computationally efficient. In molecular-dynamics simulations of protein folding, we demonstrate unmatched detection of the folded states by measuring previously undetectable entropy fluctuations along the simulation timeline. Such entropy evaluation opens a new window onto the dynamics of complex systems and allows efficient free-energy calculations.
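The idea of reading entropy off a lossless compressor can be illustrated with a minimal sketch. It is not the implementation used in this work: it assumes Python's standard `zlib` module (DEFLATE, an LZ77 variant) as the compressor, and it assumes a normalization in which the compressed size of the data is rescaled between that of a constant sequence (taken as zero entropy) and that of a uniformly random sequence over the same alphabet (taken as maximal entropy). The function names `compressed_size` and `entropy_per_symbol` are hypothetical.

```python
import math
import random
import zlib


def compressed_size(symbols: bytes) -> int:
    """Size in bytes of the input after zlib (DEFLATE) compression."""
    return len(zlib.compress(symbols, 9))


def entropy_per_symbol(symbols: bytes, alphabet_size: int, seed: int = 0) -> float:
    """Rough entropy estimate per symbol, in nats, from compressibility.

    Assumption (not taken from the abstract): the compressed size is
    rescaled between two reference sequences of the same length, a
    constant one and a uniformly random one over the same alphabet.
    Symbols must be integers in range(alphabet_size), with
    alphabet_size between 2 and 256.
    """
    if not 2 <= alphabet_size <= 256:
        raise ValueError("alphabet_size must be between 2 and 256")
    n = len(symbols)
    rng = random.Random(seed)
    random_ref = bytes(rng.randrange(alphabet_size) for _ in range(n))
    constant_ref = bytes(n)  # n zero bytes

    c_data = compressed_size(symbols)
    c_max = compressed_size(random_ref)
    c_min = compressed_size(constant_ref)
    if c_max <= c_min:
        raise ValueError("sequence too short to calibrate the compressor")

    # Incompressibility ratio, clamped to [0, 1].
    eta = (c_data - c_min) / (c_max - c_min)
    eta = min(max(eta, 0.0), 1.0)
    return eta * math.log(alphabet_size)


if __name__ == "__main__":
    # Toy data: independent two-state symbols, one state with probability 0.1.
    rng = random.Random(1)
    p = 0.1
    sample = bytes(1 if rng.random() < p else 0 for _ in range(20000))
    exact = -(p * math.log(p) + (1 - p) * math.log(1 - p))
    print("compression estimate:", entropy_per_symbol(sample, 2))
    print("exact value:         ", exact)
```

On short sequences a general-purpose compressor carries header and coding overhead, so the estimate should be read as approximate and compared against the exact value only as a sanity check.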
