Paper Information - On-Line Reinforcement Learning Using Cascade Constructive Neural Networks

To scale to problems with large or continuous state spaces, reinforcement learning algorithms need function approximation. Neural networks are a commonly used approach, and most work so far has used fixed-architecture networks. Previous supervised learning research has shown that constructive networks, which grow their architecture during training, outperform fixed-architecture networks. This paper extends the Sarsa algorithm to use a cascade constructive network and shows that it outperforms a fixed-architecture network on two benchmark tasks.
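As background on the algorithm the abstract names, the following is a minimal sketch of one Sarsa update with function approximation. It assumes a linear approximator in place of the paper's cascade constructive network, and the names `q_value` and `sarsa_update`, the feature vectors, and the default step size and discount are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def q_value(weights, features):
    """Linear action-value estimate: Q(s, a) = w . phi(s, a)."""
    return float(np.dot(weights, features))


def sarsa_update(weights, phi_sa, reward, phi_next,
                 alpha=0.1, gamma=0.9, terminal=False):
    """Apply one semi-gradient Sarsa step and return (new_weights, td_error).

    phi_sa   : features of the current state-action pair (assumed encoding)
    phi_next : features of the next state and the action actually chosen there
    """
    # Sarsa is on-policy: it bootstraps from the action actually selected next.
    if terminal:
        target = reward
    else:
        target = reward + gamma * q_value(weights, phi_next)
    td_error = target - q_value(weights, phi_sa)
    # For a linear approximator the gradient of Q w.r.t. the weights is phi_sa.
    new_weights = weights + alpha * td_error * phi_sa
    return new_weights, td_error


if __name__ == "__main__":
    w = np.zeros(3)
    phi_sa = np.array([1.0, 0.0, 0.0])
    phi_next = np.array([0.0, 1.0, 0.0])
    w, delta = sarsa_update(w, phi_sa, reward=1.0, phi_next=phi_next)
    print(w, delta)
```

A constructive network would replace the fixed weight vector with a model that also adds hidden units during training; that growth step is not shown here.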

Peter Vamplew | Robert Ollington
