Robust Optimization for Hybrid MDPs with State-Dependent Noise
Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence

Recent advances in solutions to Hybrid MDPs with discrete and continuous state and action spaces have significantly extended the class of MDPs for which exact solutions can be derived, albeit at the expense of a restricted transition noise model. In this paper, we work around the limitations of previous solutions by adopting a robust optimization approach in which Nature is allowed to adversarially determine transition noise within pre-specified confidence intervals. This allows one to derive an optimal policy with an arbitrary (user-specified) level of success probability and substantially broadens the class of transition noise models for which Hybrid MDPs can be solved. This work also extends results for the related "chance-constrained" approach in stochastic hybrid control to accommodate state-dependent noise. We demonstrate our approach on a variety of hybrid MDPs drawn from AI planning, operations research, and control theory, noting that this is the first time robust solutions with strong guarantees over all states have been automatically derived for such problems.
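
To make the adversarial formulation concrete, the following is a minimal sketch of the robust (minimax) value backup implied by the abstract; the symbols V, R, T, gamma, and the noise set N(x, a) are illustrative notation introduced here for exposition and are not taken verbatim from the paper:

    V_h(x) \;=\; \max_{a} \; \min_{n \,\in\, N(x,\,a)} \Big[ R(x, a) \;+\; \gamma \, V_{h-1}\big( T(x, a, n) \big) \Big]

Here N(x, a) denotes the (possibly state-dependent) confidence interval within which Nature may adversarially choose the transition noise n, so the resulting policy's value is guaranteed against any noise realization inside that interval, yielding the user-specified level of success probability.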
