Introduction to Chemoinformatics in Drug Discovery – A Personal View

The first issue to discuss is the definition of the topic. What is chemoinformatics, and why should you care? There is no clear definition, although a consensus view appears to be emerging. According to one view, “Chemoinformatics is the mixing of those information resources to transform data into information and information into knowledge for the intended purpose of making better decisions faster in the area of drug lead identification and optimization” [1]. Hann and Green suggest that chemoinformatics is simply a new name for an old problem [2], a viewpoint I share. There are enough reviews [3–6], and even a book by Leach and Gillet [7], focused on the topic that there is little doubt about what is meant, despite the absence of a precise, generally accepted definition.

One new aspect is the sheer magnitude of chemical information that must be processed. For example, Chemical Abstracts Service adds over three-quarters of a million new compounds to its database annually, for which large amounts of physical and chemical property data are available. Some groups regularly generate hundreds of thousands to millions of compounds through combinatorial chemistry and screen them for biological activity. Even more compounds are generated and screened in silico in the search for a magic bullet for a given disease. Each of these two processes for generating information about chemistry has its own limitations. Experimental approaches have practical limits despite automation; each in vitro bioassay consumes a finite amount of reagents, including valuable cloned and expressed receptors. Computational chemistry has to establish relevant criteria by which to select compounds of interest for synthesis and testing. The prediction of affinities with current methodology is only now becoming accurate enough to be useful. Let me emphasize the magnitude of the problem with a simple example.
I was once asked to estimate the number of compounds covered by a typical issued patent for a drug of commercial interest. The patent I selected for analysis was for enalapril, a prominent prodrug ACE inhibitor with a well-established commercial market. Given the parameters as outlined in the patent covering enalapril, an estimate of the total number of compounds included in the generic claim for enalaprilat, the active

[1]  B. L. Kalman, et al. Computer-assisted modeling of the picrotoxinin and gamma-butyrolactone receptor site. Molecular pharmacology, 1983.

[2]  G. Marshall, et al. Conformationally restricted TRH analogues: constraining the pyroglutamate region. Bioorganic & medicinal chemistry, 2002.

[3]  D. Riley. Rational design of synthetic enzymes and their potential utility as human pharmaceuticals. 2000.

[4]  M. Karplus, et al. The Levinthal paradox: yesterday and today. Folding & design, 1997.

[5]  Tudor I. Oprea, et al. Chemography: the Art of Navigating in Chemical Space. 2000.

[6]  Ioan Motoc, et al. Molecular Shape Descriptors. Steric Effects in Drug Design, 1983.

[7]  Chris M. W. Ho, et al. DBMAKER: A set of programs to generate three-dimensional databases based upon user-specified criteria. J. Comput. Aided Mol. Des., 1995.

[8]  D. T. Jones, et al. A new approach to protein fold recognition. Nature, 1992.

[9]  R. Cramer, et al. Comparative molecular field analysis (CoMFA). 1. Effect of shape on binding of steroids to carrier proteins. Journal of the American Chemical Society, 1988.

[10]  D. Baker, et al. Prediction of local structure in proteins using a library of sequence-structure motifs. Journal of molecular biology, 1998.

[11]  R. Abagyan, et al. High-throughput docking for lead generation. Current opinion in chemical biology, 2001.

[12]  G. Flynn. Substituent Constants for Correlation Analysis in Chemistry and Biology. 1980.

[13]  J. Sufrin, et al. Steric mapping of the L-methionine binding site of ATP:L-methionine S-adenosyltransferase. Molecular pharmacology, 1981.

[14]  P. J. Goodford, et al. Drug design by the method of receptor fit. Journal of medicinal chemistry, 1984.

[15]  P. Gund. Three-Dimensional Pharmacophoric Pattern Searching. 1977.

[16]  Richard Bonneau, et al. Ab initio protein structure prediction of CASP III targets using ROSETTA. Proteins, 1999.

[17]  Chris M. W. Ho. In Silico Lead Optimization. 2005.

[18]  R. Natesh, et al. Crystal structure of the human angiotensin-converting enzyme–lisinopril complex. Nature, 2003.

[19]  W. C. Still, et al. Semianalytical treatment of solvation for molecular mechanics and dynamics. 1990.

[20]  Analysis of the Binding Surfaces of Proteins. 1999.

[21]  M. Caruthers, et al. Gene synthesis machines: DNA chemistry and its uses. Science, 1985.

[22]  I. Kuntz, et al. Ligand solvation in molecular docking. Proteins, 1999.

[23]  A. J. Olson, et al. Automated docking in crystallography: Analysis of the substrates of aconitase. Proteins, 1993.

[24]  R. Cramer, et al. Recent advances in comparative molecular field analysis (CoMFA). Progress in clinical and biological research, 1989.

[25]  P. Goodford. A computational procedure for determining energetically favorable binding sites on biologically important macromolecules. Journal of medicinal chemistry, 1985.

[26]  D. Baker, et al. Design of a Novel Globular Protein Fold with Atomic-Level Accuracy. Science, 2003.

[27]  Dorica Mayer, et al. A unique geometry of the active site of angiotensin-converting enzyme consistent with structure-activity studies. J. Comput. Aided Mol. Des., 1987.

[28]  R. Zwanzig, et al. Levinthal's paradox. Proceedings of the National Academy of Sciences of the United States of America, 1992.

[29]  A. H. Beckett, et al. Synthetic analgesics: stereochemical considerations. The Journal of pharmacy and pharmacology, 1954.

[30]  L. Kier, et al. A theoretical study of receptor site models for trimethylammonium group interaction. Journal of theoretical biology, 1974.

[31]  T. J. Ritchie, et al. Chemoinformatics: manipulating chemical information to facilitate decision-making in drug discovery. Drug discovery today, 2001.

[32]  Barry A. Bunin, et al. A general and expedient method for the solid-phase synthesis of 1,4-benzodiazepine derivatives. 1992.

[33]  A. Verkman. Drug discovery in academia. American journal of physiology. Cell physiology, 2004.

[34]  Garland R. Marshall, et al. Properties of intraglobular contacts in proteins: an approach to prediction of tertiary structure. Proceedings of the Twenty-Seventh Hawaii International Conference on System Sciences, 1994.

[35]  G. R. Marshall, et al. Ab initio modeling of small, medium, and large loops in proteins. Biopolymers, 2001.

[36]  Garland R. Marshall, et al. Constrained Peptidomimetics for TRH: cis-Peptide Bond Analogs. 2000.

[37]  Garland R. Marshall, et al. VALIDATE: A New Method for the Receptor-Based Prediction of Binding Affinities of Novel Ligands. 1996.

[38]  Garland R. Marshall, et al. The Conformational Parameter in Drug Design: The Active Analog Approach. 1979.

[39]  Yi Li, et al. In silico ADME/Tox: why models fail. J. Comput. Aided Mol. Des., 2003.

[40]  Dennis P. Riley, et al. Computer-Aided Design (CAD) of Synzymes: Use of Molecular Mechanics (MM) for the Rational Design of Superoxide Dismutase Mimics. Inorganic chemistry, 1999.

[41]  Tudor I. Oprea, et al. Chemical space navigation in lead discovery. Current opinion in chemical biology, 2002.

[42]  C. Hansch, et al. A new substituent constant, π, derived from partition coefficients. 1964.

[43]  R. D. Cramer III. BC(DEF) parameters. 1. The intrinsic dimensionality of intermolecular interactions in the liquid state. 1980.

[44]  J. M. Blaney, et al. A geometric approach to macromolecule-ligand interactions. Journal of molecular biology, 1982.

[45]  R. D. Cramer, et al. Three-dimensional structure-activity relationships. Trends in pharmacological sciences, 1988.

[46]  R. B. Merrifield. Solid phase peptide synthesis. I. The synthesis of a tetrapeptide. 1963.

[47]  G. R. Marshall, et al. Conformationally restricted TRH analogs: a probe for the pyroglutamate region. Journal of medicinal chemistry, 1996.

[48]  The use of insoluble polymer supports in general organic synthesis. 1978.

[49]  R. Woods, et al. Conformational analysis and active site modelling of angiotensin-converting enzyme inhibitors. Journal of medicinal chemistry, 1985.

[50]  Garland R. Marshall, et al. Electrochemical Cyclization of Dipeptides toward Novel Bicyclic, Reverse-Turn Peptidomimetics. 1. Synthesis and Conformational Analysis of 7,5-Bicyclic Systems. 1995.

[51]  Tudor I. Oprea. Current trends in lead discovery: Are we looking for the appropriate properties? J. Comput. Aided Mol. Des., 2002.

[52]  R. Green, et al. Chemoinformatics--a new name for an old problem? Current opinion in chemical biology, 1999.

[53]  Chris M. W. Ho, et al. SPLICE: A program to assemble partial query solutions from three-dimensional database searches into novel ligands. J. Comput. Aided Mol. Des., 1993.

[54]  G. M. Crippen, et al. Distance geometry and conformational calculations. 1981.

[55]  Tudor I. Oprea, et al. Integrating virtual screening in lead discovery. Current opinion in chemical biology, 2004.

[56]  Garland R. Marshall, et al. 3D-QSAR of angiotensin-converting enzyme and thermolysin inhibitors: A comparison of CoMFA models based on deduced and experimentally determined active site geometries. 1993.

[57]  David S. Goodsell, et al. Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. J. Comput. Chem., 1998.

[58]  Tudor I. Oprea, et al. Is There a Difference Between Leads and Drugs? A Historical Perspective. 2001.

[59]  Eric A. Welsh, et al. ProVal: A protein-scoring function for the selection of native and near-native folds. Proteins, 2003.

[60]  G. Marshall, et al. Molecular Shape Descriptors. 3. Steric Mapping of Biological Receptor. 1985.

[61]  C. D. Barry, et al. Molecular requirements for recognition at glucoreceptor for insulin release. Molecular pharmacology, 1979.

[62]  G. Marshall, et al. Molecular Shape Descriptors. 1. Three-Dimensional Molecular Shape Descriptor. 1985.

[63]  D. E. Patterson, et al. Crossvalidation, Bootstrapping, and Partial Least Squares Compared with Multiple Regression in Conventional QSAR Studies. 1988.

[64]  D. Baker, et al. Prospects for ab initio protein structural genomics. Journal of molecular biology, 2001.

[65]  G. Myatt, et al. Chem-tox informatics: data mining using a medicinal chemistry building block approach. Current opinion in drug discovery & development, 2001.

[66]  Todd J. A. Ewing, et al. DREAM++: Flexible docking program for virtual combinatorial libraries. J. Comput. Aided Mol. Des., 1999.

[67]  P. Hajduk, et al. Discovering High-Affinity Ligands for Proteins: SAR by NMR. Science, 1996.

[68]  L. G. Humber, et al. Mapping the dopamine receptor. 1. Features derived from modifications in ring E of the neuroleptic butaclamol. Journal of medicinal chemistry, 1979.

[69]  Kim H. Esbensen, et al. Modelling data tables by principal components and PLS: class patterns and quantitative predictive relations. 1984.

[70]  Chris M. W. Ho, et al. Cavity search: An algorithm for the isolation and display of cavity-like binding regions. J. Comput. Aided Mol. Des., 1990.

[71]  Chris M. W. Ho, et al. FOUNDATION: A program to retrieve all possible structures containing a user-defined minimum number of matching query elements from three-dimensional databases. J. Comput. Aided Mol. Des., 1993.

[72]  Andrew R. Leach, et al. An Introduction to Chemoinformatics. 2003.

[73]  F. Brown. Chapter 35 – Chemoinformatics: What is it and How does it Impact Drug Discovery. 1998.

[74]  Molecular Shape Descriptors. 2. Quantitative Structure-Activity Relationships Based Upon Three-Dimensional Molecular Shape Descriptor. 1985.

[75]  G. E. Kellogg, et al. Allosteric modifiers of hemoglobin. 2. Crystallographically determined binding sites and hydrophobic binding/interaction analysis of novel hemoglobin oxygen effectors. Journal of medicinal chemistry, 1991.