Entropy Nets: From Decision Trees to Neural Networks

A multilayer artificial neural network (ANN) structure is capable of implementing arbitrary input-output mappings. Similarly, hierarchical classifiers, more commonly known as decision trees, are capable of generating arbitrarily complex decision boundaries in an n-dimensional space. Given a decision tree, it is possible to restructure it as a multilayer neural network. The objective of this paper is to show how this mapping of decision trees into a multilayer neural network structure can be exploited for the systematic design of a class of layered neural networks, called entropy nets, that have far fewer connections. Several important issues, such as automatic tree generation, the incorporation of incremental learning, and the generalization of knowledge acquired during the tree design phase, are discussed. Finally, a two-step methodology for designing entropy nets is presented. The advantages of this methodology are that it specifies the number of neurons needed in each layer, along with the desired output. This leads to a faster progressive training procedure that allows each layer to be trained separately. Two examples are presented to show the success of neural network design through decision tree mapping.
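The restructuring of a decision tree as a layered network can be illustrated with a small sketch. The abstract does not spell out the construction, so the code below assumes one common reading: the first layer has one threshold unit per internal tree node, the second layer has one AND unit per leaf, and the third layer has one OR unit per class. The toy tree, the names `TREE`, `tree_to_layers`, and `net_predict`, and the hard step activation are all hypothetical choices made for illustration, not the paper's own implementation.

```python
# Hypothetical toy tree (not taken from the paper): axis-parallel tests on two
# features. Internal nodes go right when x[feature] > threshold, else left.
TREE = {
    "feature": 0, "threshold": 0.5,
    "left": {"feature": 1, "threshold": 0.5,
             "left": {"label": 0}, "right": {"label": 1}},
    "right": {"feature": 1, "threshold": 0.5,
              "left": {"label": 1}, "right": {"label": 0}},
}


def tree_predict(node, x):
    """Reference answer: ordinary top-down traversal of the tree."""
    while "label" not in node:
        go_right = x[node["feature"]] > node["threshold"]
        node = node["right"] if go_right else node["left"]
    return node["label"]


def tree_to_layers(tree):
    """Collect internal-node tests and, for each leaf, the path leading to it.

    Returns (nodes, leaves) where nodes[i] = (feature, threshold) and each
    leaf is (path, label) with path = [(node_index, required_output), ...].
    """
    nodes, leaves = [], []

    def walk(node, path):
        if "label" in node:
            leaves.append((path, node["label"]))
            return
        idx = len(nodes)
        nodes.append((node["feature"], node["threshold"]))
        walk(node["left"], path + [(idx, 0)])
        walk(node["right"], path + [(idx, 1)])

    walk(tree, [])
    return nodes, leaves


def step(v):
    """Hard threshold activation (an assumption of this sketch)."""
    return 1 if v > 0 else 0


def net_predict(nodes, leaves, n_classes, x):
    # Layer 1: one unit per internal node, firing when the node's test is true.
    h1 = [step(x[f] - t) for f, t in nodes]
    # Layer 2: one AND unit per leaf. Weight +1 where the path needs a 1,
    # weight -1 where it needs a 0; the bias lets the unit fire only when
    # every test on the path has the required outcome.
    h2 = []
    for path, _ in leaves:
        total = sum(h1[i] if want == 1 else -h1[i] for i, want in path)
        n_ones = sum(want for _, want in path)
        h2.append(step(total - n_ones + 0.5))
    # Layer 3: one OR unit per class over the leaves carrying that class label.
    out = []
    for c in range(n_classes):
        votes = sum(h for h, (_, label) in zip(h2, leaves) if label == c)
        out.append(step(votes - 0.5))
    return out.index(1)


nodes, leaves = tree_to_layers(TREE)
for x in [(0.2, 0.2), (0.2, 0.8), (0.8, 0.2), (0.8, 0.8)]:
    print(x, tree_predict(TREE, x), net_predict(nodes, leaves, 2, x))
```

In this sketch each second-layer unit connects only to the first-layer units on its own root-to-leaf path, and each third-layer unit connects only to the leaves of its class, which is one way a tree-derived network can end up with fewer connections than a fully connected one. The layer sizes also fall directly out of the tree: three internal nodes, four leaves, and two classes here.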
