Neural Models and Extracted Rules for Knowledge Discovery in Predictive Toxicology

We use neural networks to predict chemically induced carcinogenesis in rodents, training on data derived from a long series of expensive and time-consuming animal tests. Neural networks are shown to be capable models for this task, providing results as good as or better than other approaches to the same problem. A new approach to relevant feature subset selection is presented which uses the connection weights of a trained network to assign relevance weights to the attributes; a threshold is then determined by hill climbing. Our Single Hidden Unit Method is shown to provide good results in reasonable time compared with other feature selection methods. Once a network is trained, its weight matrix is pruned in anticipation of rule extraction. Our iterative pruning method is shown to be capable of removing roughly three-fourths of the connections while improving accuracy. Finally, rule extraction is investigated as a means for networks to explain themselves. A brute-force approach to rule extraction, in which all possible inputs are listed as rules and the rules are then collapsed to M-of-N rules, is shown to build a reasonably small rule set that suffers only a small drop in accuracy relative to the neural network. An algorithm is presented that allows the brute-force approach to finish in reasonable time. The resulting set of 22 M-of-N rules is readable and useful for describing the knowledge learned by the network in terms that humans can understand. By applying these new tools to the field of predictive toxicology, we train a network that is estimated to have good predictive accuracy relative to other efforts in this field. In addition, the results from feature selection and the extracted rules provide new information to predictive toxicologists that is interesting because of the new approach, the provocative results, and the potential for pointing the way toward new insights.
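To make the brute-force idea concrete, the following is a minimal sketch, not the algorithm from this work. It assumes binary inputs and uses a hypothetical stand-in model (`toy_unit`, a single threshold unit with made-up weights) in place of a trained network. It lists every possible input, records which ones the model classifies as positive, and then searches for a single M-of-N rule ("at least M of these N inputs are on") that reproduces the whole table. The function names are illustrative only, and a real rule set would generally need several such rules rather than one.

```python
from itertools import combinations, product


def truth_table(predict, n_inputs):
    """Brute force: evaluate the model on every binary input vector."""
    return {x: bool(predict(x)) for x in product((0, 1), repeat=n_inputs)}


def m_of_n(m, subset):
    """Build a rule that fires when at least m of the inputs in subset are 1."""
    return lambda x: sum(x[i] for i in subset) >= m


def find_m_of_n(table, n_inputs):
    """Return (m, subset) for the first single M-of-N rule matching the table.

    Subsets are tried from smallest to largest, so the rule found uses as few
    inputs as possible. Returns None when no single rule fits exactly.
    """
    for size in range(1, n_inputs + 1):
        for subset in combinations(range(n_inputs), size):
            for m in range(1, size + 1):
                rule = m_of_n(m, subset)
                if all(rule(x) == y for x, y in table.items()):
                    return m, subset
    return None


# Hypothetical stand-in for a trained network: one threshold unit whose
# weights and bias are invented for illustration (input 3 is irrelevant).
def toy_unit(x, weights=(1.0, 1.0, 1.0, 0.0), bias=-1.5):
    return sum(w * xi for w, xi in zip(weights, x)) + bias > 0


table = truth_table(toy_unit, 4)
result = find_m_of_n(table, 4)
print(result)  # (2, (0, 1, 2)): "at least 2 of inputs 0, 1, 2"
```

The enumeration grows as 2^n in the number of inputs, which is why an approach of this kind is only practical after feature selection and pruning have reduced the network, or with an algorithm that avoids visiting every input explicitly.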
