Knowledge extraction from neural networks

Neural networks have traditionally been viewed as classification and regression systems whose internal representations are incomprehensible. It is now becoming apparent that algorithms can be designed to extract comprehensible representations from trained neural networks, enabling them to be used for data mining, i.e. the discovery and explanation of previously unknown relationships present in data. This paper reviews existing algorithms for extracting comprehensible representations from neural networks and describes research to generalize and extend the capabilities of one of these algorithms. The algorithm has been generalized for application to bioinformatics datasets, including the prediction of splice site junctions in human DNA sequences. Results on the splice site dataset are compared with those of a conventional data mining technique (C5), and conclusions are drawn regarding the application of the neural-network-based technique to other fields of interest.
