Bidding in reinforcement learning: a paradigm for multi-agent systems

This paper presents an approach for developing multi-agent reinforcement learning systems composed of a coalition of modular agents. We focus on learning to segment sequences (sequential decision tasks) into modular structures through a bidding process based on the reinforcements received during task execution. The segments are divided among the agents so that the overall task becomes easier to learn. Notably, the approach does not rely on a priori knowledge or a priori structures. Initial experiments demonstrate the basic promise of the approach. This work shows how bidding and reinforcement learning can be usefully combined, pointing to a new research direction.
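
To make the idea of bidding for control concrete, the following is a minimal sketch and not the mechanism used in the paper. It assumes tabular Q-learning agents, a bid equal to an agent's best Q-value in the current state, a highest-bid-wins rule, and a toy chain task. The names `BiddingAgent`, `select_controller`, and `run_episode` are hypothetical and introduced only for illustration.

```python
import random


class BiddingAgent:
    """Hypothetical modular agent: a tabular Q-learner that can bid for control."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha = alpha
        self.gamma = gamma

    def bid(self, state):
        # Assumption: an agent bids its own estimate of the return from this state.
        return max(self.q[state])

    def act(self, state, epsilon, rng):
        # Epsilon-greedy action choice with random tie-breaking.
        row = self.q[state]
        if rng.random() < epsilon:
            return rng.randrange(len(row))
        best = max(row)
        return rng.choice([a for a, v in enumerate(row) if v == best])

    def update(self, state, action, reward, next_value):
        # One-step Q-learning update toward reward + gamma * next_value.
        target = reward + self.gamma * next_value
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def select_controller(agents, state, rng):
    """Return the index of the highest bidder, breaking ties at random."""
    bids = [agent.bid(state) for agent in agents]
    best = max(bids)
    return rng.choice([i for i, b in enumerate(bids) if b == best])


def run_episode(agents, n_states, epsilon, rng, max_steps=100):
    """Run one episode on a toy chain; return which agent controlled each step."""
    state, owners = 0, []
    for _ in range(max_steps):
        i = select_controller(agents, state, rng)
        action = agents[i].act(state, epsilon, rng)
        # Toy chain (assumption): action 1 moves right, action 0 moves left,
        # and the only reward is 1.0 for reaching the rightmost state.
        if action == 1:
            next_state = min(state + 1, n_states - 1)
        else:
            next_state = max(state - 1, 0)
        done = next_state == n_states - 1
        reward = 1.0 if done else 0.0
        # Assumption: the value of the next state is the highest bid for it.
        next_value = 0.0 if done else max(a.bid(next_state) for a in agents)
        agents[i].update(state, action, reward, next_value)
        owners.append(i)
        state = next_state
        if done:
            break
    return owners
```

In this sketch, each maximal run of identical indices in the list returned by `run_episode` can be read as one segment of the sequence handled by a single agent. Whether a useful segmentation emerges depends on the bidding and payment rules, which are deliberately simplified here.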