Optimization and visualization of the edge weights in optimal assignment methods for virtual screening

Background
Ligand-based virtual screening plays a fundamental part in the early drug discovery stage. In a virtual screening, a chemical library is searched for molecules with properties similar to those of a query molecule by means of a similarity function. The optimal assignment of chemical graphs has proven to be a valuable similarity function for many cheminformatic tasks, such as virtual screening. The optimal assignment treats all atoms of a query molecule as equally important, which is often unrealistic given the binding mode of a ligand. The importance of a query molecule's atoms can be integrated into the optimal assignment by weighting the assignment edges. We optimized the edge weights with respect to the virtual screening performance by means of evolutionary algorithms. Furthermore, we propose a visualization approach for the interpretation of the edge weights.

Results
We evaluated two different evolutionary algorithms, differential evolution and particle swarm optimization, for their suitability for optimizing the assignment edge weights. The results showed that both optimization methods are suited to this task. Furthermore, we compared our approach to the optimal assignment with equal edge weights and to two similarity functions from the literature on a subset of the Directory of Useful Decoys using sophisticated virtual screening performance metrics. Our approach achieved a considerably better overall and early enrichment performance. The visualization of the edge weights enables the identification of substructures that are important for a good retrieval of ligands and for the binding to the protein target.

Conclusions
The optimization of the edge weights in optimal assignment methods is a valuable approach for ligand-based virtual screening experiments. The approach can be applied to any similarity function that employs the optimal assignment method, which includes a variety of similarity measures that have proven to be valuable in various cheminformatic tasks. The proposed visualization helps to gain a better understanding of the binding mode of the analyzed query molecule.
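
To illustrate how weighting the assignment edges can change the result, the following is a minimal sketch, not the authors' implementation. It assumes a precomputed atom-to-atom similarity matrix and one hypothetical importance weight per query atom; the function name, the matrix values, and the weights are made up for illustration. For brevity it enumerates all assignments by brute force instead of using the Hungarian method, so it is only suitable for very small inputs.

```python
from itertools import permutations


def weighted_optimal_assignment(sim, weights):
    """Return (score, mapping) maximizing sum_i weights[i] * sim[i][mapping[i]].

    sim[i][j]  : similarity between query atom i and database atom j (assumed given)
    weights[i] : hypothetical importance weight of query atom i
    Assumes len(sim) <= len(sim[0]). Brute force stands in for the
    Hungarian method here and only scales to tiny examples.
    """
    n_query, n_target = len(sim), len(sim[0])
    best_score, best_map = float("-inf"), None
    # Each permutation assigns every query atom to a distinct database atom.
    for perm in permutations(range(n_target), n_query):
        score = sum(weights[i] * sim[i][j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_map = score, perm
    return best_score, best_map


# Toy similarity matrix: 2 query atoms (rows) vs. 3 database atoms (columns).
sim = [
    [0.9, 0.2, 0.1],
    [0.8, 0.7, 0.3],
]

# Equal weights: both query atoms count the same.
equal = weighted_optimal_assignment(sim, [1.0, 1.0])
# Skewed weights: the second query atom is treated as far more important.
skewed = weighted_optimal_assignment(sim, [0.1, 1.0])

print("equal weights :", equal)
print("skewed weights:", skewed)
```

With equal weights the best mapping pairs query atoms 0 and 1 with database atoms 0 and 1, whereas down-weighting the first query atom lets the second one claim its best match (database atom 0). In the setting described above, such weights would be the parameters tuned by differential evolution or particle swarm optimization against a screening performance metric.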
