Symbolic and Neural Learning Algorithms: An Experimental Comparison

Although many symbolic and neural network (connectionist) learning algorithms address the same problem of learning from classified examples, very little is known about their comparative strengths and weaknesses. Experiments comparing the ID3 symbolic learning algorithm with the perceptron and backpropagation neural learning algorithms have been performed using five large, real-world data sets. Overall, backpropagation performs slightly better than the other two algorithms in terms of classification accuracy on new examples, but takes much longer to train. Experimental results suggest that backpropagation can work significantly better on data sets containing numerical data. Also analyzed empirically are the effects of (1) the amount of training data, (2) imperfect training examples, and (3) the encoding of the desired outputs. Backpropagation occasionally outperforms the other two systems when given relatively small amounts of training data. It is slightly more accurate than ID3 when examples are noisy or incompletely specified. Finally, backpropagation more effectively utilizes a “distributed” output encoding.
