[1] Amin Karbasi,et al. On Actively Teaching the Crowd to Classify , 2013, NIPS.
[2] Haipeng Luo,et al. Fast Convergence of Regularized Learning in Games , 2015, NIPS.
[3] Manuel Lopes,et al. Algorithmic and Human Teaching of Sequential Decision Tasks , 2012, AAAI.
[4] Xiaojin Zhu,et al. Machine Teaching: An Inverse Problem to Machine Learning and an Approach Toward Optimal Education , 2015, AAAI.
[5] Shie Mannor,et al. Markov Decision Processes with Arbitrary Reward Processes , 2008, Math. Oper. Res.
[6] Peter L. Bartlett,et al. Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions , 2013, NIPS.
[7] Karthik Sridharan,et al. Optimization, Learning, and Games with Predictable Sequences , 2013, NIPS.
[8] Elad Hazan,et al. Better Rates for Any Adversarial Deterministic MDP , 2013, ICML.
[9] Adam Tauman Kalai,et al. On agnostic boosting and parity learning , 2008, STOC.
[10] Yoav Freund,et al. A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.
[11] David C. Parkes,et al. Policy teaching through reward function learning , 2009, EC '09.
[12] Christos Dimitrakakis,et al. Multi-View Decision Processes: The Helper-AI Problem , 2017, NIPS.
[13] András György,et al. Online Learning in Markov Decision Processes with Changing Cost Sequences , 2014, ICML.
[14] Yishay Mansour,et al. Experts in a Markov Decision Process , 2004, NIPS.
[15] Anca D. Dragan,et al. Cooperative Inverse Reinforcement Learning , 2016, NIPS.
[16] Shie Mannor,et al. Arbitrarily modulated Markov decision processes , 2009, Proceedings of the 48th IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference.
[17] András György,et al. The adversarial stochastic shortest path problem with unknown transition probabilities , 2012, AISTATS.
[18] Ofra Amir,et al. Interactive Teaching Strategies for Agent Training , 2016, IJCAI.
[19] Chen-Yu Wei,et al. Online Reinforcement Learning in Stochastic Games , 2017, NIPS.
[20] Andreas Krause,et al. Learning to Interact With Learning Agents , 2018, AAAI.
[21] Haipeng Luo,et al. Corralling a Band of Bandit Algorithms , 2016, COLT.
[22] Siddhartha S. Srinivasa,et al. Game-Theoretic Modeling of Human Adaptation in Human-Robot Collaboration , 2017, 2017 12th ACM/IEEE International Conference on Human-Robot Interaction (HRI).
[23] Csaba Szepesvári,et al. Online Markov Decision Processes Under Bandit Feedback , 2010, IEEE Transactions on Automatic Control.
[24] Krzysztof Pietrzak,et al. Cryptography from Learning Parity with Noise , 2012, SOFSEM.
[25] Rob Fergus,et al. Modeling Others using Oneself in Multi-Agent Reinforcement Learning , 2018, ICML.
[26] Thomas Steinke,et al. Learning hurdles for sleeping experts , 2012, ITCS '12.
[27] Mohammad Taghi Hajiaghayi,et al. Regret minimization and the price of total anarchy , 2008, STOC.
[28] Sandra Zilles,et al. An Overview of Machine Teaching , 2018, arXiv.
[29] Craig Boutilier,et al. Planning, Learning and Coordination in Multiagent Decision Processes , 1996, TARK.
[30] Shie Mannor,et al. Online learning in Markov decision processes with arbitrarily changing rewards and transitions , 2009, 2009 International Conference on Game Theory for Networks.
[31] Yishay Mansour,et al. Online Markov Decision Processes , 2009, Math. Oper. Res.
[32] Vatsal Sharan,et al. Prediction with a short memory , 2016, STOC.
[33] Manfred K. Warmuth,et al. The Weighted Majority Algorithm , 1994, Inf. Comput.