Counterexample-Guided Strategy Improvement for POMDPs Using Recurrent Neural Networks
Nils Jansen | Ufuk Topcu | Bernd Becker | Ralf Wimmer | Steven Carr | Alexandru Constantin Serban