Human-in-the-Loop Synthesis for Partially Observable Markov Decision Processes
Steven Carr | Nils Jansen | Ralf Wimmer | Jie Fu | Ufuk Topcu