Dynamic clustering threshold reduces conformer ensemble size while maintaining a biologically relevant ensemble

Representing the 3D structures of ligands in virtual screening via multi-conformer ensembles can be computationally intensive, especially for compounds with many rotatable bonds. Reducing the size of multi-conformer databases and the number of query conformers, while still reproducing the bioactive conformer with good accuracy, is therefore of crucial interest. Although existing conformer generators already employ clustering and RMSD filtering, the novelty of this work is the inclusion of a clustering scheme (NMRCLUST) that does not require a user-defined cut-off value; the algorithm simultaneously optimizes the number and the average spread of the clusters. Here we describe and test four inter-dependent approaches for selecting computer-generated conformers: OMEGA, NMRCLUST, RMS filtering and averaged-RMS filtering. The bioactive conformations of 65 selected ligands were extracted from the corresponding protein:ligand complexes in the Protein Data Bank, including eight ligands that adopted dissimilar bound conformations in different receptors. We show that NMRCLUST can be used to further filter OMEGA-generated conformers while maintaining the biological relevance of the ensemble. The NMRCLUST ensembles (containing on average 10 times fewer conformers per compound) performed nearly as well as the OMEGA ensembles, and both outperformed RMS filtering and averaged-RMS filtering in identifying the bioactive conformations with excellent and good matches (0.5 < RMSD < 1.0 Å). Furthermore, we propose thresholds for OMEGA root-mean-square filtering that depend on the number of rotors in a compound: 0.8, 1.0 and 1.4 Å for structures with low (1–4), medium (5–9) and high (10–15) numbers of rotatable bonds, respectively. The protocol is general and can be applied to reduce the number of conformers in multi-conformer compound collections and to alleviate the complexity of downstream data processing in virtual screening experiments.
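As an illustration of the rotor-dependent thresholds proposed above, the following minimal Python sketch maps a rotatable-bond count to the corresponding RMS threshold and applies it in a simple greedy filter over a precomputed pairwise RMSD matrix. The function names (`rms_threshold_for_rotors`, `greedy_rms_filter`) and the greedy scheme are assumptions made for illustration only; they are not the OMEGA or NMRCLUST implementations, and the behaviour outside the 1–15 rotor range is not specified in the text, so the sketch raises an error there.

```python
from typing import List, Sequence


def rms_threshold_for_rotors(n_rotors: int) -> float:
    """Return the proposed RMS threshold for a rotatable-bond count.

    Ranges and values are taken from the abstract: low (1-4) -> 0.8,
    medium (5-9) -> 1.0, high (10-15) -> 1.4. Counts outside 1-15 are
    not covered by the text, so they are rejected here (an assumption).
    """
    if 1 <= n_rotors <= 4:
        return 0.8
    if 5 <= n_rotors <= 9:
        return 1.0
    if 10 <= n_rotors <= 15:
        return 1.4
    raise ValueError(f"no threshold proposed for {n_rotors} rotatable bonds")


def greedy_rms_filter(rmsd: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Keep a conformer only if it lies at least `threshold` away from
    every conformer already kept.

    `rmsd` is a symmetric pairwise RMSD matrix. This greedy pass is a
    generic stand-in for RMS filtering, not the OMEGA algorithm.
    """
    kept: List[int] = []
    for i in range(len(rmsd)):
        if all(rmsd[i][j] >= threshold for j in kept):
            kept.append(i)
    return kept


# Hypothetical pairwise RMSD values for four conformers of one ligand.
example_rmsd = [
    [0.0, 0.3, 1.2, 1.5],
    [0.3, 0.0, 1.1, 1.6],
    [1.2, 1.1, 0.0, 0.5],
    [1.5, 1.6, 0.5, 0.0],
]

# A ligand with 3 rotatable bonds uses the low-rotor threshold.
low_rotor_kept = greedy_rms_filter(example_rmsd, rms_threshold_for_rotors(3))
# A ligand with 12 rotatable bonds uses the high-rotor threshold.
high_rotor_kept = greedy_rms_filter(example_rmsd, rms_threshold_for_rotors(12))
```

With the same RMSD matrix, the larger threshold assigned to more flexible ligands leaves conformers that are further apart from one another, which is the intended effect of scaling the threshold with the rotor count.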
