Knowledge extraction from trained neural networks: a position paper

It is commonly accepted that one of the main drawbacks of neural networks, their lack of explanation, may be ameliorated by so-called rule extraction methods. We argue that neural networks encode nonmonotonicity, i.e., they jump to conclusions that might be withdrawn when new information becomes available. We present an extraction method that is consistent with this view. We define a partial ordering on the set of input vectors of the network and use it to confine the search space when extracting rules by querying the network. We then define a number of simplification metarules, show that the extraction is sound, and present the results of applying the extraction algorithm to the Monks' Problems (S.B. Thrun et al., 1991).
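
As an illustration of how a partial ordering on input vectors can confine the search, the following minimal sketch queries a toy single-unit network over binary inputs. It is not the extraction algorithm of the paper: the weights, the threshold, and the function names are hypothetical, and the sketch assumes non-negative weights, so that any input above a firing input in the componentwise ordering also fires. Under that assumption only the minimal firing inputs need to be found, and every input above one of them can be skipped without a query. Negated literals and the simplification metarules are not modelled here.

```python
from itertools import combinations

# Hypothetical toy "network": one threshold unit with non-negative weights.
# These values are illustrative assumptions, not taken from the paper.
WEIGHTS = [2.0, 1.0, 1.0]
THRESHOLD = 2.5


def network(x):
    """Query the toy network: True if the output unit fires on input x."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) > THRESHOLD


def leq(x, y):
    """Partial ordering on binary input vectors: x <= y componentwise."""
    return all(xi <= yi for xi, yi in zip(x, y))


def minimal_firing_inputs(n):
    """Walk the ordering bottom-up, querying the network only where needed.

    Returns the minimal firing inputs and the number of queries made.
    """
    minimal, queries = [], 0
    for k in range(n + 1):  # k = number of active inputs
        for ones in combinations(range(n), k):
            x = tuple(1 if i in ones else 0 for i in range(n))
            # x lies above a known firing input, so it fires too: skip it.
            if any(leq(m, x) for m in minimal):
                continue
            queries += 1
            if network(x):
                minimal.append(x)
    return minimal, queries


minimal, queries = minimal_firing_inputs(len(WEIGHTS))
print(minimal)  # [(1, 1, 0), (1, 0, 1)]
print(queries)  # 7 of the 8 possible inputs were queried
```

In this toy setting each minimal firing input can be read as a rule whose body is the conjunction of its active inputs, here "x1 and x2 imply the output" and "x1 and x3 imply the output". The saving is small with three inputs but grows with the number of inputs that lie above a minimal firing input.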