Minimising Regret in Route Choice

The use of reinforcement learning (RL) in multiagent scenarios is challenging. I consider the route choice problem, where drivers must choose routes that minimise their travel times. Here, selfish RL agents must adapt to each other's decisions. In this work, I show how the agents can learn, with performance guarantees, by minimising the regret associated with their decisions, thus achieving the User Equilibrium (UE). Since the UE is inefficient from a global perspective, I also focus on bridging the gap between the UE and the system optimum. In contrast to previous approaches, this work drops any full-knowledge assumption.
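To make the regret-minimisation idea concrete, below is a minimal sketch of a regret-matching driver choosing among a fixed set of routes. The class name, the toy two-route network, and the linear latency functions are illustrative assumptions, not the paper's setup; the sketch also assumes each driver can observe (or estimate) the travel time of every route after each episode, whereas the paper's contribution is precisely to avoid such full-knowledge assumptions.

```python
import random


class RegretMatchingDriver:
    """Illustrative regret-matching learner over a fixed set of routes.

    Assumes the travel time of every route is available (or estimated)
    after each episode; the paper's actual estimation scheme without
    full knowledge may differ.
    """

    def __init__(self, num_routes):
        self.num_routes = num_routes
        self.cumulative_regret = [0.0] * num_routes

    def choose_route(self):
        # Play proportionally to positive cumulative regret;
        # fall back to a uniform choice when all regrets are non-positive.
        positive = [max(r, 0.0) for r in self.cumulative_regret]
        total = sum(positive)
        if total <= 0.0:
            return random.randrange(self.num_routes)
        pick = random.uniform(0.0, total)
        acc = 0.0
        for route, weight in enumerate(positive):
            acc += weight
            if pick <= acc:
                return route
        return self.num_routes - 1

    def update(self, chosen_route, costs):
        # Regret of each alternative = travel time actually paid minus the
        # travel time that alternative would have incurred (lower is better).
        paid = costs[chosen_route]
        for route in range(self.num_routes):
            self.cumulative_regret[route] += paid - costs[route]


if __name__ == "__main__":
    # Toy two-route network with congestion-dependent (linear) latencies.
    drivers = [RegretMatchingDriver(num_routes=2) for _ in range(100)]
    loads = [0, 0]
    for episode in range(500):
        choices = [d.choose_route() for d in drivers]
        loads = [choices.count(r) for r in range(2)]
        # Route 1 has a higher free-flow time but congests more slowly.
        costs = [loads[0] / 100.0, 0.5 + loads[1] / 200.0]
        for driver, choice in zip(drivers, choices):
            driver.update(choice, costs)
    print("final route loads:", loads)
```

With these assumed latencies, the route loads tend towards the split at which both routes have equal travel times, i.e. the User Equilibrium of the toy network; it says nothing about the gap to the system optimum, which the paper addresses separately.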
