A machine learning approach to computer-aided molecular design

Summary
Preliminary results are presented for a machine learning application to computer-aided molecular design in drug discovery. The machine learning techniques treat a sample of active and inactive compounds as a set of positive and negative examples, from which a molecular model characterizing the interaction between the compounds and a target molecule is induced. The algorithm proceeds in two phases. In the first phase, the specialization step, the program identifies the active/inactive pairs of compounds that appear most useful for making the learning process effective, and generates a dictionary of molecular fragments deemed responsible for the activity of the compounds. In the second phase, the generalization step, these fragments are combined and generalized in order to select the most plausible hypothesis with respect to the sample of compounds. A knowledge base of physical and chemical properties is used during the inductive process.
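
The two phases can be illustrated with a minimal Python sketch. It is not the authors' program: the summary gives no implementation details, so every concrete choice here is an assumption. Compounds are reduced to sets of hypothetical fragment labels, pair "usefulness" is approximated by how few fragments separate an active from an inactive compound, a hypothesis is a conjunction of fragments, and plausibility is scored as actives covered minus inactives covered. The knowledge base of physical and chemical properties is left out entirely.

```python
from itertools import combinations

# Toy sample (hypothetical): each compound is a set of fragment labels.
ACTIVES = {
    "a1": {"hydroxamate", "phenyl", "amide"},
    "a2": {"hydroxamate", "isobutyl", "amide"},
}
INACTIVES = {
    "i1": {"phenyl", "amide"},
    "i2": {"isobutyl", "amide", "ester"},
}


def specialize(actives, inactives, n_pairs=3):
    """Specialization step (assumed form): take the n_pairs most similar
    active/inactive pairs and collect the fragments found only in the
    active member of each pair."""
    pairs = sorted(
        (len(a ^ i), a_name, i_name)  # symmetric difference = dissimilarity
        for a_name, a in actives.items()
        for i_name, i in inactives.items()
    )
    dictionary = set()
    for _, a_name, i_name in pairs[:n_pairs]:
        dictionary |= actives[a_name] - inactives[i_name]
    return dictionary


def generalize(dictionary, actives, inactives, max_size=2):
    """Generalization step (assumed form): combine dictionary fragments into
    conjunctive hypotheses and keep the one with the best score; ties go to
    the hypothesis enumerated first, which is the smaller one."""
    best, best_score = None, None
    for size in range(1, max_size + 1):
        for combo in combinations(sorted(dictionary), size):
            hypothesis = set(combo)
            # A hypothesis covers a compound if all its fragments are present.
            covered_pos = sum(hypothesis <= c for c in actives.values())
            covered_neg = sum(hypothesis <= c for c in inactives.values())
            score = covered_pos - covered_neg
            if best_score is None or score > best_score:
                best, best_score = hypothesis, score
    return best, best_score


fragments = specialize(ACTIVES, INACTIVES)
hypothesis, score = generalize(fragments, ACTIVES, INACTIVES)
print(fragments, hypothesis, score)
```

On this toy sample the fragment dictionary contains `hydroxamate` and `isobutyl`, and the selected hypothesis is `hydroxamate` alone, since it covers both actives and neither inactive. A realistic version would work on molecular graphs rather than label sets and would use physicochemical knowledge to decide which fragments may be generalized.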
