The Reinforcement Learning Problem

This chapter contains sections titled: The Agent-Environment Interface, Goals and Rewards, Returns, Unified Notation for Episodic and Continuing Tasks, The Markov Property, Markov Decision Processes, Value Functions, Optimal Value Functions, Optimality and Approximation, Summary, Bibliographical and Historical Remarks.