The Recurrent Control Neural Network

This paper presents our Recurrent Control Neural Network (RCNN), a model-based approach to data-efficient modelling and control of reinforcement learning problems in discrete time. Its architecture is based on a recurrent neural network (RNN) extended by an additional control network, whose specific task is to learn the optimal policy. Because the method uses neural networks, it can readily handle high-dimensional or continuous state and action spaces, and it benefits from their high system identification and approximation quality. We show that the RCNN is able to learn a potentially optimal policy by testing it on two different settings of the mountain car problem.
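
To make the architecture concrete, the following is a minimal sketch of an RNN whose internal state feeds an additional control network that outputs the action. It is an illustration under stated assumptions, not the equations of the RCNN itself: the tanh nonlinearities, the single hidden layer in the control network, the weight names (A, B, C, E, F) and all function names are hypothetical, and no training procedure is shown.

```python
import numpy as np


def init_params(state_dim, obs_dim, action_dim, hidden_dim, seed=0):
    """Draw small random weights (hypothetical names, for illustration only)."""
    rng = np.random.default_rng(seed)
    scale = 0.1
    return {
        "A": scale * rng.standard_normal((state_dim, state_dim)),    # state -> state
        "B": scale * rng.standard_normal((state_dim, obs_dim)),      # observation -> state
        "C": scale * rng.standard_normal((state_dim, action_dim)),   # action -> state
        "E": scale * rng.standard_normal((hidden_dim, state_dim)),   # control net, hidden layer
        "F": scale * rng.standard_normal((action_dim, hidden_dim)),  # control net, output layer
    }


def control(params, s):
    """Control network (assumed form): map the internal state to a bounded action."""
    return np.tanh(params["F"] @ np.tanh(params["E"] @ s))


def step(params, s, x, a):
    """Recurrent state transition (assumed form) driven by observation x and action a."""
    return np.tanh(params["A"] @ s + params["B"] @ x + params["C"] @ a)


def rollout(params, s0, observations):
    """Unroll the recurrent model, choosing each action with the control network."""
    s = s0
    actions = []
    for x in observations:
        a = control(params, s)
        actions.append(a)
        s = step(params, s, x, a)
    return np.array(actions), s
```

In this sketch, calling `rollout(init_params(4, 2, 1, 3), np.zeros(4), np.ones((5, 2)))` returns one action per time step together with the final internal state. In a model-based setting, the transition weights would typically be fitted to observed data first and the control weights adjusted afterwards, but that split is an assumption here rather than a description of the training procedure.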