CompoundProtein Interaction Prediction Within Chemogenomics: Theoretical Concepts, Practical Usage, and Future Directions

With advances in high‐throughput technologies and the open availability of bioassay data, computational methods that zoom out from a single protein with a focused ligand set to a larger, more comprehensive description of compound‐protein interactions, and that demonstrate translational validity in prospective experiments, are of prime importance. In this article, we discuss some of the new benefits and challenges of the emerging computational chemogenomics paradigm, particularly with respect to compound‐protein interactions. We present examples of experimentally validated computational predictions and recent trends in molecular feature extraction, and we consider analyses of cross‐family interactions. We also discuss the expected role of computational chemogenomics in increasingly expansive network‐level modeling and screening projects.
