Ligand expansion in ligand-based virtual screening using relevance feedback

Query expansion is the process of reformulating an original query to improve retrieval performance in information retrieval systems, and relevance feedback is one of the most useful query modification techniques in such systems. In this paper, we introduce query expansion into ligand-based virtual screening (LBVS) using the relevance feedback technique. In this approach, a Bayesian inference network first ranks the database against a single ligand molecule; a few high-ranking molecules of unknown activity are then selected from its output to form a set of ligand molecules, and this set is used to form a new ligand molecule. Simulated virtual screening experiments with the MDL Drug Data Report (MDDR) and Maximum Unbiased Validation (MUV) data sets show that ligand expansion provides a very simple way of improving LBVS, especially when the active molecules being sought have a high degree of structural heterogeneity. However, ligand expansion is slightly less effective when structurally homogeneous sets of actives are being sought.
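The expansion loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the paper: Tanimoto similarity on fingerprint bit sets stands in for the Bayesian inference network, the new ligand is formed by taking the union of the fingerprint bits of the original ligand and the top-ranked molecules (the abstract does not say how the set is merged), and the names `tanimoto`, `rank`, `expand_ligand`, `screen_with_expansion` and the parameter `top_k` are hypothetical.

```python
from typing import Dict, FrozenSet, List

# A fingerprint is modelled as the set of "on" bit positions.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient; a stand-in for the Bayesian inference network score."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank(query: Fingerprint, library: Dict[str, Fingerprint]) -> List[str]:
    """Return molecule names sorted by decreasing similarity to the query.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    return sorted(library, key=lambda name: (-tanimoto(query, library[name]), name))


def expand_ligand(query: Fingerprint, library: Dict[str, Fingerprint],
                  top_k: int = 2) -> Fingerprint:
    """Build a new ligand from the original one and its top_k nearest hits.

    Assumption: the merge is a simple union of fingerprint bits. The
    activity of the selected molecules is never consulted.
    """
    expanded = set(query)
    for name in rank(query, library)[:top_k]:
        expanded |= library[name]
    return frozenset(expanded)


def screen_with_expansion(query: Fingerprint, library: Dict[str, Fingerprint],
                          top_k: int = 2) -> List[str]:
    """Rank once, expand the ligand from the top hits, then rank again."""
    return rank(expand_ligand(query, library, top_k), library)


# Toy data, invented purely for illustration.
query = frozenset({1, 2, 3})
library = {
    "A": frozenset({1, 2, 3, 4}),
    "B": frozenset({1, 2, 5}),
    "C": frozenset({4, 5, 6}),
    "D": frozenset({3, 7, 8, 9, 10}),
}

initial_ranking = rank(query, library)
expanded_ranking = screen_with_expansion(query, library, top_k=2)
print(initial_ranking)
print(expanded_ranking)
```

In this toy library, molecule C shares no bits with the original ligand but does share bits with the two top-ranked hits, so it moves up after expansion. This mirrors the intuition that expansion helps most when the actives being sought are structurally heterogeneous.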
