Actor-critic learning based on a fuzzy inference system

Actor-critic learning is a reinforcement learning method used to find an optimal agent behavior. The only information available for learning is the feedback provided by the system (reward/punishment). Initially, this method was analyzed for discrete states and actions, in which case the value and policy functions can be represented by lookup tables. Most real-world problems, however, have large input spaces and/or continuous actions, so other function approximators must be used to introduce generalization. The actor-critic learning presented in this paper uses a fuzzy inference system (FIS) to generalize between states sharing the same fuzzy properties and, in the continuous-action case, between actions. Using an FIS rather than a global function approximator such as a neural network has two major advantages: the inherent locality of an FIS permits the introduction of human knowledge, and it confines the learning process to only the parameters implicated in the current state.
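
To make the scheme concrete, the following is a minimal sketch of actor-critic learning over an FIS, assuming a zero-order Takagi-Sugeno system with triangular membership functions on a one-dimensional state, a critic that holds one value weight per rule, and an actor that mixes per-rule action candidates by firing strength with Gaussian exploration. The class name, the update rule for the actor (a stochastic real-valued-unit style update), and all hyperparameters are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

class FuzzyActorCritic:
    """Sketch of a fuzzy actor-critic: one critic weight (v) and one
    actor action candidate (w) per fuzzy rule."""

    def __init__(self, centers, alpha=0.1, beta=0.05, gamma=0.95, sigma=0.2):
        self.centers = np.asarray(centers, dtype=float)  # MF centers over the state
        self.v = np.zeros(len(self.centers))  # critic: value weight per rule
        self.w = np.zeros(len(self.centers))  # actor: action candidate per rule
        self.alpha, self.beta, self.gamma, self.sigma = alpha, beta, gamma, sigma

    def firing_strengths(self, s):
        """Normalized triangular memberships (assumes evenly spaced centers).
        Only rules neighboring the current state fire: this is the locality
        property that restricts each update to the implicated parameters."""
        width = self.centers[1] - self.centers[0]
        phi = np.maximum(0.0, 1.0 - np.abs(s - self.centers) / width)
        return phi / phi.sum()

    def value(self, s):
        # Critic output: firing-strength-weighted sum of rule values.
        return self.firing_strengths(s) @ self.v

    def action(self, s):
        # Actor output: continuous action as a weighted mix of the rules'
        # action candidates, plus Gaussian exploration noise.
        mean = self.firing_strengths(s) @ self.w
        return mean + np.random.normal(0.0, self.sigma), mean

    def update(self, s, a, mean, r, s_next):
        phi = self.firing_strengths(s)
        td = r + self.gamma * self.value(s_next) - self.value(s)  # TD error
        self.v += self.alpha * td * phi               # critic update, local to fired rules
        self.w += self.beta * td * (a - mean) * phi   # actor: reinforce explored deviation
        return td

# Toy usage: drive a 1-D state toward zero; reward is negative squared state.
fac = FuzzyActorCritic(centers=np.linspace(-1.0, 1.0, 7))
s = 0.8
for _ in range(2000):
    a, mean = fac.action(s)
    s_next = np.clip(s + 0.1 * a, -1.0, 1.0)
    fac.update(s, a, mean, r=-s_next**2, s_next=s_next)
    s = s_next
```

Two aspects of the sketch mirror the advantages claimed above: human knowledge can be injected by initializing `w` with expert-suggested action candidates for each fuzzy region, and because the firing-strength vector is zero everywhere except near the current state, each update modifies only the few rules that actually fired.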