Solving Partially Observable Reinforcement Learning Problems with Recurrent Neural Networks
[1] Arthur L. Samuel. Some Studies in Machine Learning Using the Game of Checkers, 1967, IBM J. Res. Dev.
[2] R. Bellman. Dynamic Programming, 1957, Science.
[3] F. Takens. Detecting strange attractors in turbulence, 1981.
[4] Geoffrey E. Hinton, et al. Learning representations by back-propagating errors, 1986, Nature.
[5] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.
[7] Charles W. Anderson, et al. Strategy Learning with Multilayer Connectionist Representations, 1987.
[8] Jirí Benes, et al. On neural networks, 1990, Kybernetika.
[9] A. P. Wieland, et al. Evolving neural network controllers for unstable systems, 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[10] Michael C. Mozer, et al. Induction of Multiscale Temporal Structure, 1991, NIPS.
[11] Giovanni Soda, et al. Local Feedback Multilayered Networks, 1992, Neural Computation.
[12] Yoshua Bengio, et al. Learning long-term dependencies with gradient descent is difficult, 1994, IEEE Trans. Neural Networks.
[13] Ryszard Tadeusiewicz. Book review of Neural Networks: A Comprehensive Foundation, by Simon Haykin (Macmillan College Publishing, New York, 1994; ISBN 0-02-352761-7), 1995.
[14] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[15] Ralph Neuneier, et al. How to Train Neural Networks, 1996, Neural Networks: Tricks of the Trade.
[16] Peter Tiño, et al. Learning long-term dependencies in NARX recurrent neural networks, 1996, IEEE Trans. Neural Networks.
[17] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[18] Risto Miikkulainen, et al. 2-D Pole Balancing with Recurrent Evolutionary Networks, 1998.
[19] Simon Haykin. Neural Networks: A Comprehensive Foundation, 1998.
[20] Alex M. Andrew. Book review of Robot Learning, edited by Jonathan H. Connell and Sridhar Mahadevan (Kluwer, Boston, 1993; ISBN 0-7923-9365-1), 1999, Robotica.
[21] Lakhmi C. Jain, et al. Recurrent Neural Networks: Design and Applications, 1999.
[22] Kee-Eung Kim, et al. Learning Finite-State Controllers for Partially Observable Environments, 1999, UAI.
[24] Bram Bakker, et al. Reinforcement Learning with Long Short-Term Memory, 2001, NIPS.
[25] John F. Kolen, et al. Field Guide to Dynamical Recurrent Networks, 2001.
[26] John F. Kolen, et al. Neural Network Architectures for the Modeling of Dynamic Systems, 2001.
[27] Ralph Neuneier, et al. Modeling Dynamical Systems by Error Correction Neural Networks, 2002.
[28] Risto Miikkulainen, et al. Robust non-linear control through neuroevolution, 2003.
[29] Faustino J. Gomez. Robust Non-Linear Control through Neuroevolution, PhD thesis, 2003.
[30] Pat Langley, et al. Editorial: On Machine Learning, 1986, Machine Learning.
[31] Jennie Si, et al. Supervised Actor-Critic Reinforcement Learning, 2004.
[32] Pieter Bram Bakker. The state of mind: reinforcement learning with recurrent neural networks, 2004.
[33] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method, 2005, ECML.
[34] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction, MIT Press, 1998.
[35] Richard S. Sutton. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[36] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[37] Terrence J. Sejnowski, et al. New Directions in Statistical Signal Processing: From Systems to Brains (Neural Information Processing), 2006.
[38] Marc Toussaint, et al. Extracting Motion Primitives from Natural Handwriting Data, 2006, ICANN.
[39] Hans-Georg Zimmermann, et al. Recurrent Neural Networks Are Universal Approximators, 2006, ICANN.
[40] Steffen Udluft, et al. A Neural Reinforcement Learning Approach to Gas Turbine Control, 2007, International Joint Conference on Neural Networks (IJCNN).
[41] Steffen Udluft, et al. The Recurrent Control Neural Network, 2007, ESANN.
[42] Thomas Martinetz, et al. Neural Rewards Regression for near-optimal policy identification in Markovian and partial observable environments, 2007, ESANN.
[43] Stefan Schaal, et al. Reinforcement learning of motor skills with policy gradients, 2008, Neural Networks.
[44] Daniel Schneegaß. Steigerung der Informationseffizienz im Reinforcement-Learning [Increasing Information Efficiency in Reinforcement Learning], 2008.
[45] Martin A. Riedmiller, et al. The Neuro Slot Car Racer: Reinforcement Learning in a Real World Setting, 2009, International Conference on Machine Learning and Applications (ICMLA).
[46] Simon Haykin. Neural Networks and Learning Machines, 2010.
[47] Steffen Udluft, et al. The Markov Decision Process Extraction Network, 2010, ESANN.
[48] Dan Roth, et al. Knowledge and ignorance in reinforcement learning, 2011.
[49] Martin A. Riedmiller. 10 Steps and Some Tricks to Set up Neural Reinforcement Controllers, 2012, Neural Networks: Tricks of the Trade.
[50] Ralph Neuneier, et al. How to Train Neural Networks, 2012, Neural Networks: Tricks of the Trade.
[51] Grégoire Montavon, et al. Neural Networks: Tricks of the Trade, 2012, Lecture Notes in Computer Science.
[52] Michael T. Rosenstein, et al. Supervised Actor-Critic Reinforcement Learning, 2012.
[53] Steffen Udluft, et al. Recurrent Neural State Estimation in Domains with Long-Term Dependencies, 2012, ESANN.
[54] Hans-Georg Zimmermann, et al. Forecasting with Recurrent Neural Networks: 12 Tricks, 2012, Neural Networks: Tricks of the Trade.