Probabilistic Loss and its Online Characterization for Simplified Decision Making Under Uncertainty

Easing the computational burden of the decision making process is a long-standing objective, and understanding how sensitive this process is to simplification has far-reaching ramifications. Yet algorithms for decision making under uncertainty usually rely on approximations or heuristics without quantifying their effect, so challenging scenarios can severely impair the performance of such methods. In this paper, we first extend the decision making mechanism in full generality by removing standard approximations and accounting for all previously suppressed stochastic sources of variability. On top of this extension, our key contribution is a novel framework for simplifying decision making while assessing and controlling the simplification's impact online. Furthermore, we present novel stochastic bounds on the return and, using this framework, characterize online the effect of a particular simplification technique: reducing the number of samples in the belief representation used for planning. Finally, we verify the advantages of our approach through extensive simulations.
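To make the sample-reduction simplification concrete, below is a minimal illustrative sketch, not the paper's algorithm or its bounds: a belief represented by weighted particles is subsampled to fewer particles before a Monte Carlo estimate of the return is computed. All names here (`ParticleBelief`, `subsample`, `estimate_return`) and the toy dynamics are hypothetical, introduced only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

class ParticleBelief:
    """Belief over the state, represented by a set of weighted particles."""
    def __init__(self, particles, weights):
        self.particles = np.asarray(particles)            # shape (N, state_dim)
        w = np.asarray(weights, dtype=float)
        self.weights = w / w.sum()                        # normalize weights

    def subsample(self, m):
        """Simplified belief: keep only m particles, drawn by weight
        (with replacement), each reweighted uniformly."""
        idx = rng.choice(len(self.particles), size=m, p=self.weights)
        return ParticleBelief(self.particles[idx], np.full(m, 1.0 / m))

def estimate_return(belief, policy, transition, reward, horizon, rollouts=100):
    """Monte Carlo estimate of the expected return under `policy`,
    rolling out from states drawn from `belief`."""
    total = 0.0
    for _ in range(rollouts):
        i = rng.choice(len(belief.particles), p=belief.weights)
        x = belief.particles[i]
        g = 0.0
        for t in range(horizon):
            a = policy(x, t)
            x = transition(x, a)
            g += reward(x, a)
        total += g
    return total / rollouts

# Toy 1-D example: random-walk dynamics, reward for staying near the origin.
b = ParticleBelief(rng.normal(size=(1000, 1)), np.ones(1000))
b_simple = b.subsample(50)   # the simplified belief used for planning
policy = lambda x, t: 0.0
transition = lambda x, a: x + rng.normal(scale=0.1, size=x.shape)
reward = lambda x, a: -float(np.abs(x).sum())
print(estimate_return(b_simple, policy, transition, reward, horizon=10))
```

Planning with `b_simple` instead of `b` is cheaper, since every rollout and belief update touches 50 particles rather than 1000; the paper's contribution is to bound and monitor online the loss in return that such a reduction incurs, which this sketch deliberately does not attempt to reproduce.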
