Compact Frequency Memory for Reinforcement Learning with Hidden States

Memory-based reinforcement learning approaches keep track of the agent’s past experiences in environments with hidden states. This can require extensive memory, which limits the practical use of these methods in real-life problems. This study is motivated by the observation that less frequent transitions provide more reliable information about the agent’s current state in ambiguous environments. In this work, a selective memory approach based on transition frequencies is proposed to avoid storing transitions that are unrelated to the agent’s current state. Experiments show that using a compact and selective memory may improve and speed up the learning process.