Multi-timescale nexting in a reinforcement learning robot
[1] A. Clark. Whatever next? Predictive brains, situated agents, and the future of cognitive science, 2013, Behavioral and Brain Sciences.
[2] Patrick M. Pilarski, et al. Model-free reinforcement learning with continuous action in practice, 2012, 2012 American Control Conference (ACC).
[3] Richard S. Sutton, et al. Scaling life-long off-policy learning, 2012, 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL).
[4] Richard S. Sutton, et al. Beyond Reward: The Problem of Knowledge and Data, 2011, ILP.
[5] Patrick M. Pilarski, et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, 2011, AAMAS.
[6] Byron Boots, et al. Closing the learning-planning loop with predictive state representations, 2009, Int. J. Robotics Res.
[7] R. Sutton, et al. A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation, 2008, NIPS.
[8] Jun Tani, et al. Emergence of Functional Hierarchy in a Multiple Timescale Neural Network Model: A Humanoid Robot Experiment, 2008, PLoS Comput. Biol.
[9] Giovanni Pezzulo, et al. Coordinating with the Future: The Anticipatory Nature of Representation, 2008, Minds and Machines.
[10] Geoffrey W. Sutton. Stumbling on Happiness, 2008.
[11] C. Stevens, et al. Sweet Anticipation: Music and the Psychology of Expectation, by David Huron. Cambridge, Massachusetts: MIT Press, 2006, 2007.
[12] Sebastian Thrun, et al. Stanley: The robot that won the DARPA Grand Challenge, 2006, J. Field Robotics.
[13] Steven M. LaValle, et al. Planning Algorithms, 2006.
[14] Richard S. Sutton, et al. Temporal-Difference Networks, 2004, NIPS.
[15] H. Sebastian Seung, et al. Stochastic policy gradient reinforcement learning on a simple 3D biped, 2004, 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[16] Michael R. James, et al. Predictive State Representations: A New Theory for Modeling Dynamical Systems, 2004, UAI.
[17] Rick Grush. The emulation theory of representation: Motor control, imagery, and perception, 2004, Behavioral and Brain Sciences.
[18] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[19] Richard S. Sutton, et al. Predictive Representations of State, 2001, NIPS.
[20] Paul R. Cohen, et al. A Method for Clustering the Experiences of a Mobile Robot that Accords with Human Judgments, 2000, AAAI/IAAI.
[21] K. Carlsson, et al. Tickling Expectations: Neural Processing in Anticipation of a Sensory Stimulus, 2000, Journal of Cognitive Neuroscience.
[22] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[23] Michael I. Jordan, et al. An internal model for sensorimotor integration, 1995, Science.
[24] Richard S. Sutton, et al. TD Models: Modeling the World at a Mixture of Time Scales, 1995, ICML.
[25] Benjamin Kuipers, et al. Map Learning with Uninterpreted Sensors and Effectors, 1995, Artif. Intell.
[26] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[27] Satinder P. Singh, et al. Reinforcement Learning with a Hierarchy of Abstract Models, 1992, AAAI.
[28] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[29] R. Rescorla. Simultaneous and successive associations in sensory preconditioning, 1980, Journal of Experimental Psychology: Animal Behavior Processes.
[30] P. Young, et al. Time series analysis, forecasting and control, 1972, IEEE Transactions on Automatic Control.
[31] W. Brogden. Sensory pre-conditioning, 1939.
[32] R. Sutton, et al. Gradient temporal-difference learning algorithms, 2011.
[33] R. Sutton, et al. GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces, 2010.
[34] R. Sutton. The Grand Challenge of Predictive Empirical Abstract Knowledge, 2009.
[35] F. Kaplan, et al. Intrinsic Motivation Systems for Autonomous Mental Development, 2007, IEEE Trans. Evol. Comput.
[36] D. Levitin. This Is Your Brain on Music, 2006.
[37] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[38] Jean-Arcady Meyer, et al. Adaptive Behavior, 2005.
[39] J. Hawkins, et al. On Intelligence, 2004.
[40] Sebastian Thrun, et al. Online simultaneous localization and mapping with detection and tracking of moving objects: theory and results from a ground vehicle in crowded urban areas, 2003, 2003 IEEE International Conference on Robotics and Automation.
[41] J. L. Roux. An Introduction to the Kalman Filter, 2003.
[42] Leslie Pack Kaelbling, et al. Learning to Achieve Goals, 1993, IJCAI.
[43] Gary L. Drescher, et al. Made-Up Minds: A Constructivist Approach to Artificial Intelligence, 1991.
[44] Richard S. Sutton, et al. Time-Derivative Models of Pavlovian Reinforcement, 1990.
[45] Lennart Ljung, et al. System Identification: Theory for the User, 1987.
[46] Michael Cunningham. Intelligence: Its Organization and Development, 1972.
[47] F. W. Irwin. Purposive Behavior in Animals and Men, 1932, The Psychological Clinic.