Predicting biochemical interactions - human P450 2D6 enzyme inhibition

In silico screening of chemical libraries or virtual chemicals may reduce drug discovery and medicine optimisation lead times and increase the probability of success by directing the search through chemical space. About a dozen intelligent pharmaceutical QSAR modelling techniques were used to predict the IC50 concentration (three classes) of drug interaction with a cell wall enzyme (P450 CYP2D6). Genetic programming (GP) gave comprehensible cheminformatics models which generalised best. This was shown by a blind test, on Glaxo Wellcome molecules, of machine learning knowledge nuggets mined from SmithKline Beecham compounds. Performance on similar chemicals (interpolation) and diverse chemicals (extrapolation) suggests that generalisation is more difficult than avoiding overfitting. Two GP approaches are described: classification via regression using a multiobjective fitness measure, and direct winner-takes-all (WTA) or one-versus-all (OVA) classification. Predictive rules were compressed by separate follow-up GP runs seeded with the best program.
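
The difference between the two GP classification schemes named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class labels, the cut points 0.5 and 1.5, and both function names are hypothetical, chosen only to show how a single regression-style output or a set of per-class outputs might be decoded into one of three IC50 classes.

```python
from typing import Sequence

# Hypothetical labels for the three IC50 classes (not taken from the paper).
CLASSES = ("low", "medium", "high")


def classify_via_regression(score: float,
                            cuts: Sequence[float] = (0.5, 1.5)) -> str:
    """Decode ONE real-valued program output into a class.

    The two cut points are illustrative assumptions; in practice they
    would have to be fixed or tuned on training data.
    """
    low_cut, high_cut = cuts
    if score < low_cut:
        return CLASSES[0]
    if score < high_cut:
        return CLASSES[1]
    return CLASSES[2]


def classify_winner_takes_all(outputs: Sequence[float]) -> str:
    """Decode one output PER CLASS: the largest output wins.

    Ties are resolved in favour of the earliest class, because max()
    returns the first maximal index it encounters.
    """
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return CLASSES[best]
```

In the first scheme a single evolved expression must place every compound on one ordered scale, whereas in the second each class has its own output and only their relative sizes matter.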
