Adaptive exploration in reinforcement learning

The exploration/exploitation trade-off is a difficult problem for a reinforcement learning agent, and non-stationary environments are especially problematic for current connectionist implementations of reinforcement learning algorithms. To address such situations we introduce a novel technique, called past-success directed exploration, together with an implementation of reinforcement learning algorithms based on the fuzzy ARTMAP architecture. We experimentally compare the features of a traditional approach with those of our own.
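The abstract names past-success directed exploration without detailing its mechanics. As one plausible reading of the idea, the following Python sketch modulates an epsilon-greedy exploration rate by a running estimate of recent reward: sustained success lowers exploration, while poor recent outcomes raise it. The class name, the parameters (eps_min, eps_max, beta), and the moving-average update rule are illustrative assumptions, not the paper's algorithm.

import random

class SuccessModulatedExplorer:
    """Hypothetical sketch of success-modulated epsilon-greedy exploration.
    Illustrates the general idea only; not the paper's method."""

    def __init__(self, n_actions, eps_min=0.05, eps_max=0.5, beta=0.1):
        self.n_actions = n_actions
        self.eps_min = eps_min   # exploration floor
        self.eps_max = eps_max   # exploration ceiling
        self.beta = beta         # learning rate for the success estimate
        self.success = 0.0       # running average of reward, assumed in [0, 1]

    def epsilon(self):
        # High recent success -> exploit more; low success -> explore more.
        return self.eps_max - (self.eps_max - self.eps_min) * self.success

    def select(self, q_values):
        # Standard epsilon-greedy choice with the modulated rate.
        if random.random() < self.epsilon():
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda a: q_values[a])

    def update(self, reward):
        # Exponential moving average tracks past success over time,
        # which lets exploration re-increase if the environment shifts.
        self.success += self.beta * (reward - self.success)

Because the success estimate decays toward recent rewards, a non-stationary environment that suddenly stops rewarding the learned policy drives the estimate down and exploration back up, which is the adaptive behaviour the abstract motivates.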