Initialization of neural networks by means of decision trees

The performance of neural networks is known to be sensitive to the initial weights and to the architecture (the number of hidden layers and of neurons in those layers). This shortcoming can be alleviated when an approximate logical description of the target concept is available. The paper reports a successful attempt to initialize neural networks using decision-tree generators. The TBNN (tree-based neural net) system compares very favorably with other learners in terms of classification accuracy on unseen data, and it is also computationally less demanding than the backpropagation algorithm applied to a randomly initialized multilayer perceptron. The behavior of the system is first studied on specially designed artificial data; its performance is then demonstrated on a real-world application.
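To make the idea concrete, the sketch below shows one common way a tree-to-network mapping of this kind can be realized, in the spirit of entropy nets: each internal tree node becomes a first-layer threshold unit, each leaf becomes an AND unit over the conditions along its path, and each output unit ORs together the leaves of its class. The abstract does not specify TBNN's actual mapping, so the layer scheme, the steepness constant BETA, and the helper tree_to_init_weights below are illustrative assumptions, not the paper's method.

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    BETA = 5.0  # assumed steepness of the initial sigmoid units (hypothetical)

    def tree_to_init_weights(tree, n_features, n_classes):
        """Map a fitted axis-parallel decision tree to initial MLP weights.

        Layer 1: one unit per internal node, approximating x[feature] > threshold.
        Layer 2: one unit per leaf, an AND over the conditions along its path.
        Output: one unit per class, an OR over the leaves labelled with that class.
        """
        t = tree.tree_
        internal = [i for i in range(t.node_count) if t.children_left[i] != -1]
        leaves = [i for i in range(t.node_count) if t.children_left[i] == -1]
        col = {n: j for j, n in enumerate(internal)}   # node id -> layer-1 unit
        row = {n: k for k, n in enumerate(leaves)}     # node id -> layer-2 unit

        W1 = np.zeros((len(internal), n_features)); b1 = np.zeros(len(internal))
        for n, j in col.items():
            W1[j, t.feature[n]] = BETA        # sigmoid(BETA * (x[f] - threshold))
            b1[j] = -BETA * t.threshold[n]

        W2 = np.zeros((len(leaves), len(internal))); b2 = np.zeros(len(leaves))
        W3 = np.zeros((n_classes, len(leaves))); b3 = np.full(n_classes, -0.5 * BETA)

        def walk(node, path):                 # path: [(layer-1 unit, +1/-1), ...]
            if t.children_left[node] == -1:   # leaf: wire up its AND unit
                k = row[node]
                for j, sign in path:
                    W2[k, j] = sign * BETA    # +1: condition true, -1: false
                n_false = sum(1 for _, s in path if s < 0)
                b2[k] = BETA * (n_false - len(path) + 0.5)   # fires iff all match
                W3[int(np.argmax(t.value[node])), k] = BETA  # OR into leaf's class
                return
            j = col[node]
            walk(t.children_left[node], path + [(j, -1)])    # condition false
            walk(t.children_right[node], path + [(j, +1)])   # condition true

        walk(0, [])
        return (W1, b1), (W2, b2), (W3, b3)

    # Example: derive a network topology and starting weights from a small tree.
    X, y = load_iris(return_X_y=True)
    dt = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
    (W1, b1), (W2, b2), (W3, b3) = tree_to_init_weights(dt, X.shape[1], 3)
    print(W1.shape, W2.shape, W3.shape)  # topology read off the tree structure

Backpropagation would then be run on the resulting network; since the starting point already encodes the tree's decision surface, training plausibly needs fewer epochs than from a random initialization, which is consistent with the computational advantage the abstract reports.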
