Multi-objective decision-theoretic planning
[1] R. Bellman. A Markovian Decision Process, 1957.
[2] A. S. Manne. Linear Programming and Sequential Decisions, 1960.
[3] Karl Johan Åström, et al. Optimal control of Markov processes with incomplete state information, 1965.
[4] A. Sen, et al. Collective Choice and Social Welfare, 2017.
[5] P. McMullen. The maximum numbers of faces of a convex polytope, 1970.
[6] E. J. Sondik, et al. The Optimal Control of Partially Observable Markov Decision Processes, 1971.
[7] Ronald L. Graham, et al. An Efficient Algorithm for Determining the Convex Hull of a Finite Planar Set, 1972, Inf. Process. Lett.
[8] Ray A. Jarvis, et al. On the Identification of the Convex Hull of a Finite Set of Points in the Plane, 1973, Inf. Process. Lett.
[9] Arnon Rosenthal. Nonserial dynamic programming is optimal, 1977, STOC '77.
[10] G. Monahan. State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms, 1982.
[11] D. White. Multi-objective infinite-horizon discounted Markov decision processes, 1982.
[12] Stefan Arnborg, et al. Efficient algorithms for combinatorial problems on graphs with bounded decomposability — A survey, 1985, BIT.
[13] Judea Pearl, et al. Probabilistic reasoning in intelligent systems - networks of plausible inference, 1991, Morgan Kaufmann series in representation and reasoning.
[14] Hsien-Te Cheng, et al. Algorithms for partially observable Markov decision processes, 1989.
[15] Leslie Pack Kaelbling, et al. Acting Optimally in Partially Observable Stochastic Domains, 1994, AAAI.
[16] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[17] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 1995.
[18] Daniel P. Miranker, et al. On the Space-Time Trade-off in Solving Constraint Satisfaction Problems, 1995, IJCAI.
[19] Arthur C. Graesser, et al. Is it an Agent, or Just a Program?: A Taxonomy for Autonomous Agents, 1996, ATAL.
[20] Anders R. Kristensen, et al. Dynamic programming and Markov decision processes, 1996.
[21] Rina Dechter, et al. Bucket elimination: A unifying framework for probabilistic inference, 1996, UAI.
[22] Craig Boutilier, et al. Planning, Learning and Coordination in Multiagent Decision Processes, 1996, TARK.
[23] Robert T. Clemen, et al. Making Hard Decisions: An Introduction to Decision Analysis, 1997.
[24] Michael L. Littman, et al. Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes, 1997, UAI.
[25] G. Ziegler, et al. Basic properties of convex polytopes, 1997.
[26] John N. Tsitsiklis, et al. Introduction to linear optimization, 1997, Athena Scientific optimization and computation series.
[27] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[28] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[29] A. Cassandra, et al. Exact and approximate algorithms for partially observable Markov decision processes, 1998.
[30] S. Mahadevan, et al. Solving Semi-Markov Decision Problems Using Average Reward Reinforcement Learning, 1999.
[31] Craig Boutilier, et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage, 1999, J. Artif. Intell. Res.
[32] Jesse Hoey, et al. SPUDD: Stochastic Planning using Decision Diagrams, 1999, UAI.
[33] Neil Immerman, et al. The Complexity of Decentralized Control of Markov Decision Processes, 2000, UAI.
[34] Luc Devroye, et al. Estimating the number of vertices of a polyhedron, 2000, Inf. Process. Lett.
[35] Milos Hauskrecht, et al. Value-Function Approximations for Partially Observable Markov Decision Processes, 2000, J. Artif. Intell. Res.
[36] Eitan Altman, et al. Applications of Markov Decision Processes in Communication Networks, 2000.
[37] Carlos Guestrin, et al. Multiagent Planning with Factored MDPs, 2001, NIPS.
[38] Anne Condon, et al. On the undecidability of probabilistic planning and related stochastic optimization problems, 2003, Artif. Intell.
[39] Marc E. Pfetsch, et al. Some Algorithmic Problems in Polytope Theory, 2003, Algebra, Geometry, and Software Systems.
[40] Marco Laumanns, et al. Performance assessment of multiobjective optimizers: an analysis and review, 2003, IEEE Trans. Evol. Comput.
[41] Claudia V. Goldman, et al. Transition-independent decentralized Markov decision processes, 2003, AAMAS '03.
[42] Nikos A. Vlassis, et al. Sparse cooperative Q-learning, 2004, ICML.
[43] Nikos A. Vlassis, et al. Anytime algorithms for multiagent decision making using coordination graphs, 2004, IEEE International Conference on Systems, Man and Cybernetics.
[44] Shlomo Zilberstein, et al. Region-Based Incremental Pruning for POMDPs, 2004, UAI.
[45] Patrice Perny, et al. GAI Networks for Utility Elicitation, 2004, KR.
[46] Sean R. Eddy, et al. What is dynamic programming?, 2004, Nature Biotechnology.
[47] Nikos A. Vlassis, et al. Perseus: Randomized Point-based Value Iteration for POMDPs, 2005, J. Artif. Intell. Res.
[48] Makoto Yokoo, et al. Networked Distributed POMDPs: A Synergy of Distributed Constraint Optimization and POMDPs, 2005, IJCAI.
[49] Rina Dechter, et al. The Relationship Between AND/OR Search and Variable Elimination, 2005, UAI.
[50] I. Y. Kim, et al. Adaptive weighted-sum method for bi-objective optimization: Pareto front generation, 2005.
[51] David Furcy, et al. Limited Discrepancy Beam Search, 2005, IJCAI.
[52] John K. Slaney, et al. Decision-Theoretic Planning with non-Markovian Rewards, 2011, J. Artif. Intell. Res.
[53] Javier Larrosa, et al. Bucket elimination for multiobjective optimization problems, 2006, J. Heuristics.
[54] Nikos A. Vlassis, et al. Collaborative Multiagent Reinforcement Learning by Payoff Propagation, 2006, J. Mach. Learn. Res.
[55] D. Bergemann, et al. Efficient Dynamic Auctions, 2006.
[56] Joelle Pineau, et al. Anytime Point-Based Approximations for Large POMDPs, 2006, J. Artif. Intell. Res.
[57] László Monostori, et al. Agent-based systems for manufacturing, 2006.
[58] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[59] Rina Dechter, et al. AND/OR search spaces for graphical models, 2007, Artif. Intell.
[60] Radford M. Neal. Pattern Recognition and Machine Learning, 2007, Technometrics.
[61] David Levine, et al. Managing Power Consumption and Performance of Computing Systems Using Reinforcement Learning, 2007, NIPS.
[62] Srini Narayanan, et al. Learning all optimal policies with multiple criteria, 2008, ICML '08.
[63] Bart De Schutter, et al. A Comprehensive Survey of Multiagent Reinforcement Learning, 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[64] David Hsu, et al. SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces, 2008, Robotics: Science and Systems.
[65] Rina Dechter, et al. AND/OR search strategies for combinatorial optimization in graphical models, 2008.
[66] Michael I. Jordan, et al. Graphical Models, Exponential Families, and Variational Inference, 2008, Found. Trends Mach. Learn.
[67] Ruggiero Cavallo, et al. Efficiency and redistribution in dynamic mechanism design, 2008, EC '08.
[68] Emma Rollón, et al. Multi-objective optimization in graphical models, 2008.
[69] Patrice Perny, et al. Multiobjective Optimization using GAI Models, 2009, IJCAI.
[70] Hisashi Handa. Solving Multi-objective Reinforcement Learning Problems by EDA-RL - Acquisition of Various Strategies, 2009, Ninth International Conference on Intelligent Systems Design and Applications.
[71] Andrei V. Kelarev, et al. Constructing Stochastic Mixture Policies for Episodic Multiobjective Reinforcement Learning Tasks, 2009, Australasian Conference on Artificial Intelligence.
[72] Nir Friedman, et al. Probabilistic Graphical Models - Principles and Techniques, 2009.
[73] Patrice Perny, et al. Choquet Optimization Using GAI Networks for Multiagent/Multicriteria Decision-Making, 2009, ADT.
[74] Hisashi Handa. EDA-RL: estimation of distribution algorithms for reinforcement learning problems, 2009, GECCO '09.
[75] Radu Marinescu, et al. Exploiting Problem Decomposition in Multi-objective Constraint Optimization, 2009, CP.
[76] Susan A. Murphy, et al. Efficient Reinforcement Learning with Multiple Reward Functions for Randomized Controlled Trial Analysis, 2010, ICML.
[77] Edmund H. Durfee, et al. Influence-Based Policy Abstraction for Weakly-Coupled Dec-POMDPs, 2010, ICAPS.
[78] Frans A. Oliehoek, et al. Value-Based Planning for Teams of Agents in Stochastic Partially Observable Environments, 2010.
[79] David Hsu, et al. Planning under Uncertainty for Robotic Tasks with Mixed Observability, 2010, Int. J. Robotics Res.
[80] Peter Auer, et al. UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem, 2010, Period. Math. Hung.
[81] Sven Koenig, et al. BnB-ADOPT: an asynchronous branch-and-bound DCOP algorithm, 2008, AAMAS.
[82] Radu Marinescu. Efficient Approximation Algorithms for Multi-objective Constraint Optimization, 2011, ADT.
[83] Qiang Liu, et al. Bounding the Partition Function using Hölder's Inequality, 2011, ICML.
[84] D. Pardoe. Adaptive trading agent strategies using market experience, 2011.
[85] Yiannis Demiris, et al. Evolving policies for multi-reward partially observable Markov decision processes (MR-POMDPs), 2011, GECCO '11.
[86] Tommi S. Jaakkola, et al. Introduction to dual decomposition for inference, 2011.
[87] Qiang Liu, et al. Variational algorithms for marginal MAP, 2011, J. Mach. Learn. Res.
[88] Nicholas R. Jennings, et al. Bounded decentralised coordination over multiple objectives, 2011, AAMAS.
[89] Yiannis Demiris, et al. Multi-reward policies for medical applications: anthrax attacks and smart wheelchairs, 2011, GECCO.
[90] Kee-Eung Kim, et al. Closing the Gap: Improved Bounds on Optimal POMDP Solutions, 2011, ICAPS.
[91] Guy Shani, et al. A Survey of Point-Based POMDP Solvers, 2013, Auton. Agents Multi Agent Syst.
[92] Istvan Szita, et al. Reinforcement Learning in Games, 2012, Reinforcement Learning.
[93] Thomas Keller, et al. PROST: Probabilistic Planning Based on UCT, 2012, ICAPS.
[94] Nic Wilson, et al. Multi-objective Influence Diagrams, 2012, UAI.
[95] Lars Otten, et al. Join-graph based cost-shifting schemes, 2012, UAI.
[96] Shie Mannor, et al. Bayesian Reinforcement Learning, 2012, Reinforcement Learning.
[97] Peter Vrancx, et al. Reinforcement Learning: State-of-the-Art, 2012.
[98] Hado van Hasselt, et al. Reinforcement Learning in Continuous State and Action Spaces, 2012, Reinforcement Learning.
[99] Bo An, et al. Multi-objective optimization for security games, 2012, AAMAS.
[100] Matthijs T. J. Spaan, et al. Partially Observable Markov Decision Processes, 2010, Encyclopedia of Machine Learning.
[101] Shimon Whiteson, et al. Exploiting Structure in Cooperative Bayesian Games, 2012, UAI.
[102] Shimon Whiteson, et al. Computing Convex Coverage Sets for Multi-objective Coordination Graphs, 2013, ADT.
[103] Charles L. Isbell, et al. Point Based Value Iteration with Optimal Belief Compression for Dec-POMDPs, 2013, NIPS.
[104] Patrice Perny, et al. Approximation of Lorenz-Optimal Solutions in Multiobjective Markov Decision Processes, 2013, AAAI.
[105] Malte Helmert, et al. Trial-Based Heuristic Tree Search for Finite Horizon MDPs, 2013, ICAPS.
[106] Ashutosh Nayyar, et al. Decentralized Stochastic Control with Partial History Sharing: A Common Information Approach, 2012, IEEE Transactions on Automatic Control.
[107] Jan Peters, et al. Reinforcement learning in robotics: A survey, 2013, Int. J. Robotics Res.
[108] Ann Nowé, et al. Designing multi-objective multi-armed bandits algorithms: A study, 2013, International Joint Conference on Neural Networks (IJCNN).
[109] Wolfgang Ketter, et al. Autonomous Agents in Future Energy Markets: The 2012 Power Trading Agent Competition, 2013, AAAI.
[110] Shimon Whiteson, et al. Multi-objective variable elimination for collaborative graphical games, 2013, AAMAS.
[111] Rina Dechter. Reasoning with Probabilistic and Deterministic Graphical Models: Exact Algorithms, 2013.
[112] Shimon Whiteson, et al. A Survey of Multi-Objective Sequential Decision-Making, 2013, J. Artif. Intell. Res.
[113] Mathijs de Weerdt, et al. Planning under Uncertainty for Coordinating Infrastructural Maintenance, 2013, ICAPS.
[114] Olivier Buffet, et al. Optimally Solving Dec-POMDPs as Continuous-State MDPs, 2013, IJCAI.
[115] Leo van Moergestel, et al. Agent Technology in Agile Multiparallel Manufacturing and Product Support, 2014.
[116] Shimon Whiteson, et al. Queued Pareto Local Search for Multi-Objective Optimization, 2014, PPSN.
[117] Shimon Whiteson, et al. Linear support for multi-objective coordination graphs, 2014, AAMAS.
[118] Bernard Manderick, et al. The scalarized multi-objective multi-armed bandit problem: An empirical study of its exploration vs. exploitation tradeoff, 2014, International Joint Conference on Neural Networks (IJCNN).
[119] Peter R. Lewis, et al. A novel adaptive weight selection algorithm for multi-objective multi-agent reinforcement learning, 2014, International Joint Conference on Neural Networks (IJCNN).
[120] Marco Wiering, et al. Model-based multi-objective reinforcement learning, 2014, IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[121] Shimon Whiteson, et al. Bounded Approximations for Linear Multi-Objective Planning Under Uncertainty, 2014, ICAPS.
[122] Ann Nowé, et al. Multi-objective reinforcement learning using sets of Pareto dominating policies, 2014, J. Mach. Learn. Res.
[123] Frans A. Oliehoek, et al. Dec-POMDPs as Non-Observable MDPs, 2014.
[124] Doina Precup, et al. Algorithms for multi-armed bandit problems, 2014, ArXiv.
[125] Shimon Whiteson, et al. Computing Convex Coverage Sets for Faster Multi-objective Coordination, 2015, J. Artif. Intell. Res.
[126] Shimon Whiteson, et al. Point-Based Planning for Multi-Objective POMDPs, 2015, IJCAI.
[127] Frans A. Oliehoek, et al. Structure in the value function of zero-sum games of incomplete information, 2015.
[128] Shimon Whiteson. Pareto Local Policy Search for MOMDP Planning, 2015.
[129] Frans A. Oliehoek, et al. Quality Assessment of MORL Algorithms: A Utility-Based Approach, 2015.
[130] Nic Wilson, et al. Computing Possibly Optimal Solutions for Multi-Objective Constraint Optimisation with Tradeoffs, 2015, IJCAI.
[131] Diederik M. Roijers. Variational Multi-Objective Coordination, 2015.
[132] Patrice Perny, et al. Incremental Weight Elicitation for Multiobjective State Space Search, 2015, AAAI.
[133] Shlomo Zilberstein, et al. Multi-Objective POMDPs with Lexicographic Reward Preferences, 2015, IJCAI.
[134] Frans A. Oliehoek, et al. Factored Upper Bounds for Multiagent Planning Problems under Uncertainty with Non-Factored Value Functions, 2015, IJCAI.
[135] Mathijs de Weerdt, et al. Solving Multi-agent MDPs Optimally with Conditional Return Graphs, 2015.
[136] Mathijs de Weerdt, et al. Solving Transition-Independent Multi-Agent MDPs with Sparse Interactions, 2015, AAAI.
[137] Juliane Hahn, et al. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned, 2016.
[138] Shimon Whiteson, et al. Multi-Objective Decision Making, 2017, Synthesis Lectures on Artificial Intelligence and Machine Learning.