Making the World Differentiable: On Using Self-Supervised Fully Recurrent Neural Networks for Dynamic Reinforcement Learning
[1] M. Gherrity,et al. A learning algorithm for analog, fully recurrent neural networks , 1989, International 1989 Joint Conference on Neural Networks.
[2] Teuvo Kohonen,et al. Self-Organization and Associative Memory , 1988 .
[3] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[4] Charles W. Anderson,et al. Learning and problem-solving with multilayer connectionist systems (adaptive, strategy learning, neural networks, reinforcement learning) , 1986 .
[5] H. Franke,et al. Ästhetik als Informationsverarbeitung , 1974 .
[6] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[7] Paul J. Werbos,et al. Generalization of backpropagation with application to a recurrent gas market model , 1988, Neural Networks.
[8] Jürgen Schmidhuber,et al. A possibility for implementing curiosity and boredom in model-building neural controllers , 1991 .
[9] Barak A. Pearlmutter. Learning State Space Trajectories in Recurrent Neural Networks , 1989, Neural Computation.
[10] Paul J. Werbos,et al. Building and Understanding Adaptive Systems: A Statistical/Numerical Approach to Factory Automation and Brain Research , 1987, IEEE Transactions on Systems, Man, and Cybernetics.
[11] B. Widrow,et al. The truck backer-upper: an example of self-learning in neural networks , 1989, International 1989 Joint Conference on Neural Networks.
[12] Frank Fallside,et al. Dynamic reinforcement driven error propagation networks with application to game playing , 1989 .
[13] Ronald J. Williams,et al. Experimental Analysis of the Real-time Recurrent Learning Algorithm , 1989 .
[14] P. Werbos,et al. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences , 1974 .
[15] Jeffrey L. Elman,et al. Finding Structure in Time , 1990, Cogn. Sci.
[16] Jürgen Schmidhuber,et al. A local learning algorithm for dynamic feedforward and recurrent networks , 1990, Forschungsberichte, TU Munich.
[17] Jürgen Schmidhuber,et al. Reinforcement Learning with Interacting Continually Running Fully Recurrent Networks , 1990 .
[18] P. J. Werbos,et al. Backpropagation and neurocontrol: a review and prospectus , 1989, International 1989 Joint Conference on Neural Networks.
[19] P. Werbos,et al. Expectation Driven Learning with an Associative Memory , 1990 .
[20] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[21] Jürgen Schmidhuber,et al. Recurrent networks adjusted by adaptive critics , 1990 .
[22] Jürgen Schmidhuber,et al. Networks adjusting networks , 1990, Forschungsberichte, TU Munich.
[23] Michael I. Jordan,et al. Learning to Control an Unstable System with Forward Modeling , 1989, NIPS.
[24] T. Sejnowski,et al. Learning Algorithms for Networks with Internal and External Feedback , 1990 .
[25] Michael I. Jordan. Supervised learning and systems with excess degrees of freedom , 1988 .
[26] R. J. Williams,et al. On the use of backpropagation in associative reinforcement learning , 1988, IEEE 1988 International Conference on Neural Networks.
[27] Arthur L. Samuel,et al. Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev..
[28] Jürgen Schmidhuber,et al. An on-line algorithm for dynamic reinforcement learning and planning in reactive environments , 1990, 1990 IJCNN International Joint Conference on Neural Networks.
[29] S. Piche,et al. First-Order Gradient Descent Training of Adaptive Discrete-Time Dynamic Networks , 1991 .