Policy teaching through reward function learning
[1] Craig Boutilier,et al. Regret-based Utility Elicitation in Constraint-based Decision Problems , 2005, IJCAI.
[2] Santosh S. Vempala,et al. Solving convex programs by random walks , 2004, JACM.
[3] Sven Rady,et al. Optimal Experimentation in a Changing Environment , 1997 .
[4] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[5] Craig Boutilier,et al. Incremental utility elicitation with minimax regret decision criterion , 2003, IJCAI.
[6] Eyal Amir,et al. Bayesian Inverse Reinforcement Learning , 2007, IJCAI.
[7] Craig Boutilier,et al. New Approaches to Optimization and Utility Elicitation in Autonomic Computing , 2005, AAAI.
[8] Mia Stern,et al. Applications of AI in education , 1996, CROS.
[9] Moshe Tennenholtz,et al. k-Implementation , 2003, EC '03.
[10] T. Mulgan. The Contract Theory , 2006 .
[11] C. Boutilier,et al. Accelerating Reinforcement Learning through Implicit Imitation , 2003, J. Artif. Intell. Res.
[12] Krzysztof Z. Gajos,et al. Preference elicitation for interface optimization , 2005, UIST.
[13] Jesse Hoey,et al. A planning system based on Markov decision processes to guide people with dementia through activities of daily living , 2006, IEEE Transactions on Information Technology in Biomedicine.
[14] Noam Nisan,et al. Proceedings of the 4th ACM conference on Electronic commerce , 2003 .
[15] Tuomas Sandholm,et al. Preference elicitation in combinatorial auctions , 2001, AAMAS '02.
[16] David C. Parkes,et al. A General Approach to Environment Design with One Agent , 2009, IJCAI.
[17] Robert J. Vanderbei,et al. Linear Programming: Foundations and Extensions , 1998, Kluwer international series in operations research and management science.
[18] Andrew Y. Ng,et al. Algorithms for Inverse Reinforcement Learning , 2000, ICML.
[19] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[20] H. Varian. Revealed Preference , 2006 .
[21] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[22] Moshe Babaioff,et al. Mixed Strategies in Combinatorial Agency , 2006, WINE.
[23] Moshe Babaioff,et al. Algorithmic Game Theory: Incentives in Peer-to-Peer Systems , 2007 .
[24] Craig Boutilier,et al. A Bayesian Approach to Imitation in Reinforcement Learning , 2003, IJCAI.
[25] Moshe Babaioff,et al. Combinatorial agency , 2006, EC '06.
[26] B. Grünbaum. Partitions of mass-distributions and of convex bodies by hyperplanes. , 1960 .
[27] Scott Shenker,et al. Hidden-action in multi-hop routing , 2005, EC '05.
[28] Krzysztof Z. Gajos,et al. Automatically generating custom user interfaces for users with physical disabilities , 2006, Assets '06.
[29] Daphne Koller,et al. Learning an Agent's Utility Function by Observing Behavior , 2001, ICML.
[30] D. Bergemann,et al. Learning and Strategic Pricing , 1996 .
[31] Craig Boutilier,et al. Eliciting Bid Taker Non-price Preferences in (Combinatorial) Auctions , 2004, AAAI.
[32] Luis Rademacher,et al. Approximating the centroid is hard , 2007, SCG '07.
[33] Craig Boutilier,et al. Constraint-Based Optimization with the Minimax Decision Criterion , 2003, CP.
[34] Rajesh P. N. Rao,et al. A Probabilistic Framework for Model-Based Imitation Learning , 2004 .
[35] David C. Parkes,et al. Value-Based Policy Teaching with Active Indirect Elicitation , 2008, AAAI.
[36] Craig Boutilier,et al. A POMDP formulation of preference elicitation problems , 2002, AAAI/IAAI.
[37] Hao Zhang,et al. A Dynamic Principal-Agent Model with Hidden Information: Sequential Optimality Through Truthful State Revelation , 2008, Oper. Res.
[38] Nicole Immorlica,et al. Game-Theoretic Aspects of Designing Hyperlink Structures , 2006, WINE.
[39] Daphne Koller,et al. Making Rational Decisions Using Adaptive Utility Elicitation , 2000, AAAI/IAAI.