Hierarchical reinforcement learning via dynamic subspace search for multi-agent planning
Jorge Cortés | Michael Ouimet | Aaron Ma