Learning dynamics in stochastic routing games

We consider a repeated nonatomic network routing game in which a large number of risk-neutral users are uncertain about the edge latency functions and learn about them from past travel experience. We assume that the network has affine stochastic edge latency functions with unknown slopes. We study a simple learning process in which agents share common observations of travel times, estimate the unknown edge slope parameters via ordinary least squares and, at every step, route their traffic demand according to the Wardrop equilibrium computed from the mean latency functions with the most recent estimates. We prove that under these learning dynamics, the flow vector in the network converges almost surely to the full-information Wardrop equilibrium. Moreover, the slope parameters of all edges used in the full-information Wardrop equilibrium are learned almost surely in the limit.
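The learning loop described above can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions made here only for illustration: the network is reduced to parallel links between one origin and one destination, the intercepts are known, the noise is additive zero-mean Gaussian, and every slope estimate starts at 1.0. The names `wardrop_parallel` and `simulate` are hypothetical.

```python
import random


def wardrop_parallel(slopes, intercepts, demand):
    """Wardrop equilibrium on parallel links with mean latency a*x + b, a > 0.

    Bisects on the common latency level lam of the used links; a link with
    intercept b carries flow max(0, (lam - b) / a).
    """
    lo = min(intercepts)                           # total flow is 0 here
    hi = max(intercepts) + max(slopes) * demand    # total flow >= demand here
    for _ in range(100):
        lam = 0.5 * (lo + hi)
        total = sum(max(0.0, (lam - b) / a) for a, b in zip(slopes, intercepts))
        if total < demand:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return [max(0.0, (lam - b) / a) for a, b in zip(slopes, intercepts)]


def simulate(true_slopes, intercepts, demand, steps, noise_std, seed=0):
    """Repeat: route by the estimated equilibrium, observe, re-estimate by OLS."""
    rng = random.Random(seed)
    n = len(true_slopes)
    est = [1.0] * n    # initial slope guesses (assumption of this sketch)
    sxx = [0.0] * n    # running sum of x^2 per link
    sxy = [0.0] * n    # running sum of x * (y - b) per link
    flow = [0.0] * n
    for _ in range(steps):
        flow = wardrop_parallel(est, intercepts, demand)
        for e in range(n):
            if flow[e] > 1e-9:    # only used links produce observations
                noise = rng.gauss(0.0, noise_std)
                y = true_slopes[e] * flow[e] + intercepts[e] + noise
                sxx[e] += flow[e] ** 2
                sxy[e] += flow[e] * (y - intercepts[e])
                # OLS through the origin for the slope, kept strictly positive
                est[e] = max(sxy[e] / sxx[e], 1e-6)
    return flow, est


if __name__ == "__main__":
    flow, est = simulate([2.0, 1.0], [0.0, 0.5], 1.0, steps=2000, noise_std=0.05)
    print("flow:", flow)
    print("estimated slopes:", est)
```

In this two-link instance the full-information equilibrium splits the unit demand evenly, since 2x = (1 - x) + 0.5 gives x = 0.5. Links that carry no flow produce no observations in the sketch, which is consistent with the abstract's statement that slopes are learned for the edges used in the full-information equilibrium.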
