Tower of Hanoi with Connectionist Networks: Learning New Features

A connectionist system previously used to solve the numerical control task of balancing a pole (Barto, Sutton, and Anderson, 1983; Anderson, 1987) is applied to a Tower of Hanoi puzzle. The connectionist system consists of two networks: an evaluation network that learns an evaluation function of states, and an action network that learns to select actions as a function of the puzzle's state and previous actions. The initial state representation is insufficient: new features must be learned to form a useful evaluation function. Comparisons of methodology are made with Langley's (1985) adaptive production system, SAGE.2.
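The split between an evaluation of states and a selection of actions can be illustrated with a small sketch. Everything below is an assumption made for illustration, not the paper's implementation: the three-disk puzzle size, the binary disk-on-peg encoding, the six-move action set, and the names `legal_actions`, `step`, `encode`, and `td_update` are hypothetical. In particular, `td_update` uses lookup tables as a stand-in for the two networks, so it learns no new features, and it ignores previous actions.

```python
from itertools import permutations

N_DISKS = 3   # assumption: a three-disk puzzle
PEGS = (0, 1, 2)
# Six candidate actions: move the top disk of peg `src` to peg `dst`.
ACTIONS = list(permutations(PEGS, 2))


def top_disk(state, peg):
    """Return the smallest disk on `peg`, or None if the peg is empty.

    state[d] is the peg holding disk d, where disk 0 is the smallest.
    """
    for disk, p in enumerate(state):
        if p == peg:
            return disk
    return None


def legal_actions(state):
    """Actions that move a disk onto an empty peg or onto a larger disk."""
    legal = []
    for src, dst in ACTIONS:
        moving, target = top_disk(state, src), top_disk(state, dst)
        if moving is not None and (target is None or moving < target):
            legal.append((src, dst))
    return legal


def step(state, action):
    """Apply a legal action and return the resulting state."""
    src, dst = action
    new_state = list(state)
    new_state[top_disk(state, src)] = dst
    return tuple(new_state)


def encode(state):
    """Assumed input encoding: one binary unit per (disk, peg) pair."""
    return [1.0 if state[d] == p else 0.0
            for d in range(N_DISKS) for p in PEGS]


def td_update(values, prefs, state, action, reward, next_state,
              alpha=0.1, beta=0.1, gamma=0.9):
    """Tabular stand-in for the two learners.

    `values` plays the role of the evaluation function and `prefs` the
    role of action preferences; both are adjusted by the same
    temporal-difference error.
    """
    error = (reward + gamma * values.get(next_state, 0.0)
             - values.get(state, 0.0))
    values[state] = values.get(state, 0.0) + alpha * error
    key = (state, action)
    prefs[key] = prefs.get(key, 0.0) + beta * error
    return error
```

With all disks on peg 0, `legal_actions((0, 0, 0))` yields only the two moves of the smallest disk. A table over raw states cannot generalize between states; that limitation is one way to read the abstract's point that the initial representation needs additional learned features.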

Charles W. Anderson

[1] Charles W. Anderson, et al. Strategy Learning with Multilayer Connectionist Representations, 1987.

[2] Pat Langley, et al. Learning to search: from weak methods to domain-specific heuristics, 1985.

[3] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.

[4] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.

[5] Geoffrey E. Hinton, et al. Symbols Among the Neurons: Details of a Connectionist Inference Architecture, 1985, IJCAI.