Benchmarking Commercial Conformer Ensemble Generators

We assess and compare the performance of eight commercial conformer ensemble generators (ConfGen, ConfGenX, cxcalc, iCon, MOE LowModeMD, MOE Stochastic, MOE Conformation Import, and OMEGA) and one leading free algorithm, the distance geometry algorithm implemented in RDKit. The comparative study is based on a new version of the Platinum Diverse Dataset, a high-quality benchmarking dataset of 2859 protein-bound ligand conformations extracted from the PDB. Differences in the performance of commercial algorithms are much smaller than those observed for free algorithms in our previous study (J. Chem. Inf. Model. 2017, 57, 529-539). For commercial algorithms, the median minimum root-mean-square deviations measured between protein-bound ligand conformations and ensembles of a maximum of 250 conformers are between 0.46 and 0.61 Å. Commercial conformer ensemble generators are characterized by their high robustness, with at least 99% of all input molecules successfully processed and few or even no substantial geometrical errors detectable in their output conformations. The RDKit distance geometry algorithm (with minimization enabled) appears to be a good free alternative since its performance is comparable to that of the midranked commercial algorithms. Based on a statistical analysis, we elaborate on which algorithms to use and how to parametrize them for best performance in different application scenarios.
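The accuracy measure quoted above, the median over all ligands of the minimum RMSD between the protein-bound conformation and any member of the generated ensemble, can be illustrated with a minimal sketch. It is not the study's implementation. It assumes that each conformer has already been superposed onto the reference and that atoms are listed in matching order, and it ignores molecular symmetry. The function names `rmsd`, `min_rmsd`, and `median_min_rmsd` are hypothetical; the default ensemble cap of 250 conformers is taken from the text.

```python
import math
from statistics import median

# A conformation is a list of (x, y, z) atom coordinates in angstroms.
Coords = list[tuple[float, float, float]]


def rmsd(a: Coords, b: Coords) -> float:
    """RMSD between two conformations with identical atom ordering.

    Assumption: the conformations are already superposed; no alignment
    or symmetry correction is performed here.
    """
    if len(a) != len(b) or not a:
        raise ValueError("conformations must be non-empty and of equal size")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


def min_rmsd(reference: Coords, ensemble: list[Coords],
             max_conformers: int = 250) -> float:
    """Smallest RMSD between the reference and the first max_conformers
    members of the ensemble."""
    considered = ensemble[:max_conformers]
    if not considered:
        raise ValueError("ensemble is empty")
    return min(rmsd(reference, conformer) for conformer in considered)


def median_min_rmsd(dataset: list[tuple[Coords, list[Coords]]],
                    max_conformers: int = 250) -> float:
    """Median of the per-ligand minimum RMSD over a benchmark dataset of
    (reference conformation, generated ensemble) pairs."""
    return median(
        min_rmsd(reference, ensemble, max_conformers)
        for reference, ensemble in dataset
    )
```

In practice the superposition and symmetry handling would be delegated to a cheminformatics toolkit. The sketch only makes explicit how a per-ligand minimum is taken first and the median is then computed across ligands.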
