[1] Ming-Yu Liu,et al. Tactics of Adversarial Attack on Deep Reinforcement Learning Agents , 2017, IJCAI.
[2] Fabio Roli,et al. Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning , 2018, CCS.
[3] Shipra Agrawal,et al. Optimistic posterior sampling for reinforcement learning: worst-case regret bounds , 2022, NIPS.
[4] Amin Karbasi,et al. On Actively Teaching the Crowd to Classify , 2013, NIPS.
[5] Pedram Daee,et al. Machine Teaching of Active Sequential Learners , 2019, NeurIPS.
[6] Michael Kearns,et al. On the complexity of teaching , 1991, COLT '91.
[7] Yevgeniy Vorobeychik,et al. Data Poisoning Attacks on Factorization-Based Collaborative Filtering , 2016, NIPS.
[8] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994.
[9] Percy Liang,et al. Stronger data poisoning attacks break data sanitization defenses , 2018, Machine Learning.
[10] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[11] Volkan Cevher,et al. Interactive Teaching Algorithms for Inverse Reinforcement Learning , 2019, IJCAI.
[12] Wei Chu,et al. A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.
[13] Felipe Leno da Silva,et al. A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems , 2019, J. Artif. Intell. Res.
[14] Ness Shroff,et al. Data Poisoning Attacks on Stochastic Bandits , 2019, ICML.
[15] Manuel Lopes,et al. Algorithmic and Human Teaching of Sequential Decision Tasks , 2012, AAAI.
[16] Andreas Krause,et al. Near-Optimally Teaching the Crowd to Classify , 2014, ICML.
[17] Yishay Mansour,et al. Experts in a Markov Decision Process , 2004, NIPS.
[18] Xiaojin Zhu,et al. Preference-Based Batch and Sequential Teaching: Towards a Unified View of Models , 2019, NeurIPS.
[19] Lihong Li,et al. Adversarial Attacks on Stochastic Bandits , 2018, NeurIPS.
[20] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[21] J. Doug Tygar,et al. Adversarial machine learning , 2011, AISec '11.
[22] Vern Paxson,et al. What's Clicking What? Techniques and Innovations of Today's Clickbots , 2011, DIMVA.
[23] Thomas J. Walsh,et al. Dynamic Teaching in Sequential Decision Making Environments , 2012, UAI.
[24] Sebastian Tschiatschek,et al. Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints , 2019, NeurIPS.
[25] Pieter Abbeel,et al. An Algorithmic Perspective on Imitation Learning , 2018, Found. Trends Robotics.
[26] Sridhar Mahadevan,et al. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning , 1991, Artif. Intell.
[27] Prasad Tadepalli,et al. H-Learning: A Reinforcement Learning Method for Optimizing Undiscounted Average Reward , 1994.
[28] Pietro Perona,et al. Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners , 2018, NeurIPS.
[29] Quanyan Zhu,et al. Deceptive Reinforcement Learning Under Adversarial Manipulations on Cost Signals , 2019, GameSec.
[30] Xiaojin Zhu,et al. Using Machine Teaching to Identify Optimal Training-Set Attacks on Machine Learners , 2015, AAAI.
[31] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[32] Anca D. Dragan,et al. Cooperative Inverse Reinforcement Learning , 2016, NIPS.
[33] Peter Auer,et al. Near-optimal Regret Bounds for Reinforcement Learning , 2008, J. Mach. Learn. Res..
[34] Manuela M. Veloso,et al. Interactive robot task training through dialog and demonstration , 2007, 2nd ACM/IEEE International Conference on Human-Robot Interaction (HRI).
[35] Rómer Rosales,et al. Simple and Scalable Response Prediction for Display Advertising , 2014, ACM Trans. Intell. Syst. Technol.
[36] Scott Niekum,et al. Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications , 2018, AAAI.
[37] Peter Auer,et al. Logarithmic Online Regret Bounds for Undiscounted Reinforcement Learning , 2006, NIPS.
[38] Blaine Nelson,et al. Poisoning Attacks against Support Vector Machines , 2012, ICML.
[39] David C. Parkes,et al. Value-Based Policy Teaching with Active Indirect Elicitation , 2008, AAAI.
[40] Seong Joon Oh,et al. Sequential Attacks on Agents for Long-Term Adversarial Goals , 2018, ArXiv.
[41] Sebastian Tschiatschek,et al. Teaching Inverse Reinforcement Learners via Features and Demonstrations , 2018, NeurIPS.
[42] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[43] Sridhar Mahadevan,et al. Average reward reinforcement learning: Foundations, algorithms, and empirical results , 2004, Machine Learning.
[44] Yuxin Chen,et al. Understanding the Power and Limitations of Teaching with Imperfect Knowledge , 2020, IJCAI.
[45] Sandy H. Huang,et al. Adversarial Attacks on Neural Network Policies , 2017, ICLR.
[46] Claudia Eckert,et al. Is Feature Selection Secure against Training Data Poisoning? , 2015, ICML.
[47] Xiaojin Zhu,et al. Machine Teaching: An Inverse Problem to Machine Learning and an Approach Toward Optimal Education , 2015, AAAI.
[48] Xiaojin Zhu,et al. Policy Poisoning in Batch Reinforcement Learning and Control , 2019, NeurIPS.
[49] Xiaojin Zhu,et al. An Optimal Control View of Adversarial Machine Learning , 2018, ArXiv.
[50] Sandra Zilles,et al. An Overview of Machine Teaching , 2018, ArXiv.
[51] Paul Barford,et al. Data Poisoning Attacks against Autoregressive Models , 2016, AAAI.
[52] Lihong Li,et al. Data Poisoning Attacks in Contextual Bandits , 2018, GameSec.
[53] David C. Parkes,et al. Policy teaching through reward function learning , 2009, EC '09.
[54] Meikang Qiu,et al. Reinforcement Learning for Cyber-Physical Systems , 2019, IEEE International Conference on Industrial Internet (ICII).