Molecular free energies, rates, and mechanisms from data-efficient path sampling simulations

Molecular dynamics simulations are a powerful tool for studying the thermodynamics and kinetics of complex molecular events. However, these simulations can rarely reach the required time scales in practice. Transition path sampling overcomes this limitation by collecting unbiased trajectories that capture the relevant events. Moreover, integrating machine learning can accelerate the sampling while simultaneously learning a quantitative representation of the mechanism. Still, the resulting trajectories are by construction not Boltzmann-distributed, which prevents the calculation of free energies and rates. We developed an algorithm to approximate the equilibrium path ensemble from machine learning-guided path sampling data. Our algorithm thus provides efficient sampling together with the mechanism, free energies, and rates of rare molecular events, all at a moderate computational cost. We tested the method on the folding of the mini-protein chignolin. Our algorithm is straightforward and data-efficient, opening the door to applications to many challenging molecular systems.
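
To illustrate why an equilibrium (Boltzmann-distributed) ensemble is needed for free energies, the sketch below estimates a free energy profile from weighted samples using the standard relation F(x) = -k_B T ln p(x). It is not the algorithm developed in this work: it assumes that a weight per configuration has already been obtained by some reweighting of the path sampling data, and the function name `free_energy_profile`, the order parameter values, and the weights are hypothetical placeholders.

```python
import math

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def free_energy_profile(values, weights, n_bins, lo, hi, temperature=300.0):
    """Estimate F(x) = -k_B T ln p(x) from weighted samples.

    values  : order parameter value of each sampled configuration (hypothetical)
    weights : equilibrium weight of each configuration, assumed to come from
              a prior reweighting step that is not shown here
    Returns bin centers and the free energy per bin in kJ/mol, shifted so
    that the lowest bin is zero. Empty bins are reported as infinity.
    """
    width = (hi - lo) / n_bins
    hist = [0.0] * n_bins
    for x, w in zip(values, weights):
        if lo <= x <= hi:
            # The upper edge is assigned to the last bin.
            i = min(int((x - lo) / width), n_bins - 1)
            hist[i] += w

    total = sum(hist)
    if total <= 0.0:
        raise ValueError("no weighted samples fall inside [lo, hi]")

    kT = K_B * temperature
    profile = [-kT * math.log(h / total) if h > 0 else math.inf for h in hist]
    f_min = min(profile)
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    return centers, [f - f_min for f in profile]


# Toy usage: two states whose total weights differ by a factor of three,
# so the free energy difference is k_B T ln 3.
centers, profile = free_energy_profile([0.1, 0.9], [3.0, 1.0], 2, 0.0, 1.0)
```

Without such weights, a histogram of raw path sampling configurations would over-represent the transition region, and the resulting profile would not be a free energy.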