Approximately Optimal Risk-Averse Routing Policies via Adaptive Discretization

Mitigating risk in decision-making is a longstanding problem. Due to its nonlinear nature, especially in adaptive decision-making problems, finding optimal risk-averse policies is typically intractable. With a focus on efficient algorithms, we ask how well we can approximate optimal policies for the difficult case of general utility models of risk. Little is known about efficient algorithms beyond the very special cases of linear (risk-neutral) and exponential utilities, since general utilities are not separable and preclude the use of traditional dynamic programming techniques. In this paper, we consider general utility functions and investigate efficient computation of approximately optimal routing policies, where the goal is to maximize the expected utility of arriving at a destination around a given deadline. We present an adaptive discretization variant of successive approximation that yields an ε-optimal policy in polynomial time. The main insight is to perform the discretization in the utility-level space, which induces a nonuniform discretization of the domain and applies to any monotone utility function.
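As a rough illustration of the utility-level discretization idea (a minimal sketch, not the paper's algorithm; all function and parameter names here are hypothetical), the snippet below uniformly discretizes the *range* of a monotone utility function into steps of size ε and inverts each level by bisection, producing a nonuniform grid in the time domain that is finer where the utility changes quickly:

```python
import math

def range_space_grid(u, t_lo, t_hi, eps):
    """Discretize the range of a monotone increasing utility u on
    [t_lo, t_hi] into steps of size eps, then invert each level by
    bisection to obtain a nonuniform grid of domain points."""
    u_lo, u_hi = u(t_lo), u(t_hi)
    n_levels = int(math.ceil((u_hi - u_lo) / eps)) + 1
    levels = [u_lo + k * eps for k in range(n_levels)]
    grid = []
    for level in levels:
        lo, hi = t_lo, t_hi
        for _ in range(60):  # bisection; valid since u is monotone
            mid = (lo + hi) / 2
            if u(mid) < level:
                lo = mid
            else:
                hi = mid
        grid.append((lo + hi) / 2)
    return grid

# Example: a concave monotone utility. Equal utility steps translate
# into domain points that cluster where u is steep (near t = 0).
grid = range_space_grid(lambda t: 1 - math.exp(-3 * t), 0.0, 1.0, 0.1)
```

Because consecutive grid points differ by at most ε in utility value, any policy evaluated on this grid loses at most ε per lookup, which is the intuition behind the ε-optimality guarantee for monotone utilities.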
