Monitoring of the Conformational Space of Dipeptides by Generative Topographic Mapping

This work describes a procedure to build generative topographic maps (GTMs) as 2D representations of the conformational space (CS) of dipeptides. GTMs well suited to support highly predictive landscapes of various conformational properties are reported for three dipeptides (AA, KE and KR). CS monitoring via GTM proceeds by projecting conformer ensembles onto the map, which produces cumulated responsibility (CR) vectors characteristic of the CS areas covered by each ensemble. The overlap of the CS areas visited by two distinct simulations can be expressed as the Tanimoto coefficient (Tc) of the associated CR vectors. This idea was used to monitor the reproducibility of the stochastic evolutionary conformer generation process implemented in S4MPLE. Conformers produced by fewer than 500 S4MPLE runs were shown to reproducibly cover the relevant CS zone for a given setup of the driving force field. The propensity of a simulation to visit the native CS zone can thus be estimated quantitatively as the Tc score with respect to the "native" CR, defined by the ensemble of dipeptide geometries extracted from PDB proteins. Low-energy CS regions were indeed found to fall within the native zone. The Tc overlap score behaved as a smooth function of the force field parameters. This opens the perspective of a novel force field parameter tuning procedure, intended to simultaneously optimize the behavior of in silico simulations for every possible dipeptide.
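The CR-based overlap score described above can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: each conformer is taken to have a responsibility row over the map nodes, the CR vector is taken as the sum of those rows rescaled to unit total, and Tc is computed with the common continuous (real-valued) Tanimoto formula. The function names and toy numbers are hypothetical and are not taken from S4MPLE or any GTM package.

```python
from typing import List, Sequence


def cumulated_responsibility(resp: Sequence[Sequence[float]]) -> List[float]:
    """Sum per-conformer responsibility rows into one CR vector.

    Assumption: the result is rescaled to sum to 1 so that ensembles of
    different sizes can be compared; the abstract does not specify this.
    """
    n_nodes = len(resp[0])
    cr = [0.0] * n_nodes
    for row in resp:
        for k, r in enumerate(row):
            cr[k] += r
    total = sum(cr)
    return [v / total for v in cr] if total > 0 else cr


def tanimoto(a: Sequence[float], b: Sequence[float]) -> float:
    """Continuous Tanimoto coefficient: a.b / (|a|^2 + |b|^2 - a.b)."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    return dot / denom if denom > 0 else 0.0


# Toy example on a hypothetical 4-node map (real maps have many more nodes).
ensemble_a = [[0.7, 0.3, 0.0, 0.0], [0.6, 0.4, 0.0, 0.0]]
ensemble_b = [[0.0, 0.0, 0.5, 0.5]]

cr_a = cumulated_responsibility(ensemble_a)
cr_b = cumulated_responsibility(ensemble_b)

print(tanimoto(cr_a, cr_a))  # identical coverage
print(tanimoto(cr_a, cr_b))  # disjoint map areas
```

Under these assumptions, two ensembles that populate the same map nodes in the same proportions score 1, while ensembles confined to disjoint map areas score 0; comparing a simulated ensemble against a PDB-derived "native" CR would use the same call.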
