Better Generalization with Forecasts
[1] Richard S. Sutton, et al. Using Predictive Representations to Improve Generalization in Reinforcement Learning, 2005, IJCAI.
[2] Michael I. Jordan, et al. Advances in Neural Information Processing Systems 30, 1995.
[3] Richard S. Sutton, et al. Temporal Abstraction in Temporal-difference Networks, 2005, NIPS.
[4] Patrick M. Pilarski, et al. Acquiring a broad range of empirical knowledge in real time by temporal-difference learning, 2012, 2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC).
[5] Justin A. Boyan, et al. Least-Squares Temporal Difference Learning, 1999, ICML.
[6] R. Sutton, et al. Acquiring Diverse Predictive Knowledge in Real Time by Temporal-difference Learning, 2012.
[7] Mark B. Ring. Learning Sequential Tasks by Incrementally Adding Higher Orders, 1992, NIPS.
[8] Herbert Jaeger, et al. Observable Operator Models for Discrete Stochastic Time Series, 2000, Neural Computation.
[9] Richard S. Sutton, et al. Temporal-Difference Networks, 2004, NIPS.
[10] R. Sutton, et al. GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces, 2010.
[11] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[12] M. V. Rossum, et al. In Neural Computation, 2022.
[13] Thomas G. Dietterich, et al. In Advances in Neural Information Processing Systems 12, 1991, NIPS 1991.
[14] Koby Crammer, et al. Advances in Neural Information Processing Systems 14, 2002.
[15] Thomas Degris, et al. Scaling-up Knowledge for a Cognizant Robot, 2012, AAAI Spring Symposium: Designing Intelligent Robots.
[16] Steven J. Bradtke, et al. Linear Least-Squares algorithms for temporal difference learning, 2004, Machine Learning.
[17] Richard S. Sutton, et al. Predictive Representations of State, 2001, NIPS.
[18] Mark B. Ring. Continual learning in reinforcement environments, 1995, GMD-Bericht.
[19] Thomas G. Dietterich. What is machine learning?, 2020, Archives of Disease in Childhood.
[20] Richard S. Sutton, et al. GQ(lambda): A general gradient algorithm for temporal-difference prediction learning with eligibility traces, 2010, Artificial General Intelligence.
[21] Patrick M. Pilarski, et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, 2011, AAMAS.
[22] Ronald L. Rivest, et al. Diversity-based inference of finite automata, 1994, 28th Annual Symposium on Foundations of Computer Science (sfcs 1987).
[23] International Foundation for Autonomous Agents and MultiAgent Systems (IFAAMAS), 2007.
[24] Frans M. J. Willems, et al. The context-tree weighting method: basic properties, 1995, IEEE Trans. Inf. Theory.