Large-scale reverse docking profiles and their applications

Background: Reverse docking approaches have been explored in previous drug discovery studies to overcome some of the problems of traditional virtual screening. However, current reverse docking approaches are limited in two ways: the target spaces examined were rather small, and the applications were restricted to identifying new drug targets. In this study, we expanded the target space to the set of all protein structures currently available and developed several new applications of the reverse docking method.

Results: We generated a 2D matrix of docking scores between all available protein structures in yeast and human and 35 well-known drugs. By clustering the docking profile data and comparing the result with a fingerprint-based clustering of the drugs, we first showed that the docking profiles contain accurate information on the drugs' chemical properties. Next, we showed that our method can be used to predict the druggability of target proteins. We also showed that a combination of sequence similarity and docking profile similarity predicts enzyme EC numbers more accurately than sequence similarity alone. In two case studies, 5-fluorouracil and cycloheximide, we showed that our method can successfully identify target proteins.

Conclusions: By using a large number of protein structures, we improved the sensitivity of reverse docking and showed that using as many protein structures as possible is important for finding real binding targets.
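To make the idea of a docking profile concrete, the following is a minimal sketch, not the authors' implementation. It assumes each drug is represented by its row of the docking score matrix and that two profiles are compared by Pearson correlation; the abstract does not state which similarity measure was used. The function names, the toy scores, and the weight `alpha` in `combined_similarity` are all hypothetical and only illustrate how a sequence similarity and a docking profile similarity might be combined into a single score.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no ranking information
    return cov / (sx * sy)


def profile_similarity(matrix):
    """Pairwise correlation between the rows of a docking score matrix.

    `matrix` maps a row name to its list of docking scores; every row
    must cover the same structures in the same order.
    """
    names = sorted(matrix)
    return {
        (a, b): pearson(matrix[a], matrix[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


def combined_similarity(seq_sim, dock_sim, alpha=0.5):
    """Weighted mix of two similarities; alpha is a hypothetical weight."""
    return alpha * seq_sim + (1 - alpha) * dock_sim


# Toy scores (invented for illustration): rows are drugs,
# columns are protein structures.
scores = {
    "drug_A": [1.0, 2.0, 3.0, 4.0],
    "drug_B": [2.0, 4.0, 6.0, 8.0],
    "drug_C": [4.0, 3.0, 2.0, 1.0],
}

sims = profile_similarity(scores)
closest = max(sims, key=sims.get)  # pair with the most similar profiles
```

In this toy matrix, `drug_A` and `drug_B` rank the structures identically, so they form the closest pair. A pairwise similarity table of this kind could then be passed to any hierarchical clustering routine and the resulting grouping compared with a fingerprint-based one.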
