Robot Learning in Partially Observable, Noisy, Continuous Worlds

Partially observable Markov decision processes (POMDPs) pose special difficulties for learning robot control policies because perceptually aliased states must be disambiguated. Short-term memories of recent actions and/or percepts provide the context the robot needs to perform this disambiguation. We introduce Variable-Resolution Percept Discretization (VRPD), an extension to Utile Suffix Memory (USM), an algorithm designed to solve discrete POMDPs. This extension allows USM to function effectively in noisy, continuous worlds. We describe the extension in detail and then demonstrate experimentally the improvements it brings to USM in continuous POMDPs.