Comparing geometric and kinetic cluster algorithms for molecular simulation data.

The identification of metastable states of a molecule plays an important role in the interpretation of molecular simulation data because the free-energy surface, the relative populations in this landscape, and ultimately the dynamics of the molecule under study can be described in terms of these states. We compare the results of three geometric cluster algorithms (neighbor algorithm, K-medoids algorithm, and common-nearest-neighbor algorithm) with one another and with the results of a kinetic cluster algorithm. First, we demonstrate the characteristics of each geometric cluster algorithm using five two-dimensional data sets. Second, we analyze the molecular dynamics data of a β-heptapeptide in methanol (a molecule that exhibits a distinct folded state, a structurally diverse unfolded state, and a fast folding/unfolding equilibrium) using both geometric and kinetic cluster algorithms. We find that geometric clustering strongly depends on the algorithm used and that the density-based common-nearest-neighbor algorithm is the most robust of the three geometric cluster algorithms with respect to variations in the input parameters and the distance metric. When comparing the geometric cluster results to the metastable states of the β-heptapeptide as identified by kinetic clustering, we find that in most cases the folded state is identified correctly, but the overlap of geometric clusters with further metastable states is often at best approximate.
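As an illustration of what a density-based, common-nearest-neighbor style of geometric clustering can look like on a small two-dimensional data set, the following is a minimal sketch and not the implementation used in the study. The function name `cnn_cluster`, the parameters `radius` and `min_shared`, and the linking rule (two points are joined if they lie within `radius` of each other and share at least `min_shared` neighbors) are assumptions made for this example.

```python
from itertools import combinations
from math import dist


def cnn_cluster(points, radius, min_shared):
    """Illustrative shared-neighbor clustering; returns one label per point.

    Assumed rule (not taken from the paper): points i and j are linked if
    they are within `radius` of each other and have at least `min_shared`
    neighbors in common. Clusters are the connected components of links.
    """
    n = len(points)
    # Neighbor set of each point: all other points within `radius`.
    neighbors = [
        {j for j in range(n) if j != i and dist(points[i], points[j]) <= radius}
        for i in range(n)
    ]

    # Union-find structure used to merge linked points into clusters.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if j in neighbors[i] and len(neighbors[i] & neighbors[j]) >= min_shared:
            parent[find(i)] = find(j)

    # Relabel cluster roots as consecutive integers in order of appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]
```

With this rule, a point that has no sufficiently connected neighbors ends up in a cluster of its own, so sparse regions between dense groups are not forced into either group. Both `radius` and `min_shared` are input parameters that would need to be chosen for the data at hand.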
