Model-Based or Model-Free, a Review of Approaches in Reinforcement Learning

Reinforcement learning (RL) algorithms can successfully solve a wide range of problems. Following AlphaGo's match against Ke Jie in 2017, public attention to RL reached an entirely new level. RL methods are usually divided into two categories, model-based and model-free, each with distinct advantages. Model-free RL has been applied successfully to a variety of tasks, such as playing video games and robotic control, but it requires many samples to achieve good performance. Model-based RL can quickly obtain near-optimal control by learning a model within a rather limited class of dynamics. In this setting, knowledge about the environment can be acquired in an unsupervised manner, even from trajectories where no rewards are available. Its main disadvantage is that most model-based algorithms rely on simple function approximators and learn local models that overfit to a small number of samples, usually one mini-batch. The main body of this paper summarizes the classic algorithms in RL. The discussion section then considers newer approaches that aim to retain the strengths of both model-based and model-free algorithms.