The Reinforcement Learning Problem
This chapter contains sections titled: The Agent-Environment Interface, Goals and Rewards, Returns, Unified Notation for Episodic and Continuing Tasks, The Markov Property, Markov Decision Processes, Value Functions, Optimal Value Functions, Optimality and Approximation, Summary, Bibliographical and Historical Remarks