Matthew E. Taylor | Tim Brys | Peter Vamplew | Richard Dazeley | Cameron Foale | Adam Bignold | Francisco Cruz
[1] Ioannis P. Vlahavas,et al. Reinforcement learning agents providing advice in complex video games , 2014, Connect. Sci..
[2] Long Ji Lin,et al. Programming Robots Using Reinforcement Learning and Teaching , 1991, AAAI.
[3] Michael I. Jordan,et al. Advances in Neural Information Processing Systems 30 , 1995 .
[4] Felipe Leno da Silva,et al. A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems , 2019, J. Artif. Intell. Res..
[5] Mostafa Ghobaei-Arani,et al. Joint computation offloading and resource provisioning for edge‐cloud computing environment: A machine learning‐based approach , 2020, Softw. Pract. Exp..
[6] G. Zayaraz,et al. A Brief Survey on Concept Drift , 2015 .
[7] Stefan Wermter,et al. Training Agents With Interactive Reinforcement Learning and Contextual Affordances , 2016, IEEE Transactions on Cognitive and Developmental Systems.
[8] Gerhard Weiss,et al. Reinforcement Learning Transfer Using a Sparse Coded Inter-task Mapping , 2011, EUMAS.
[9] W.D. Smart,et al. What does shaping mean for computational reinforcement learning? , 2008, 2008 7th IEEE International Conference on Development and Learning.
[10] Alessandra Sciutti,et al. Learning from Learners: Adapting Reinforcement Learning Agents to be Competitive in a Card Game , 2020, 2020 25th International Conference on Pattern Recognition (ICPR).
[11] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[12] Patrick M. Pilarski,et al. Between Instruction and Reward: Human-Prompted Switching , 2012, AAAI Fall Symposium: Robots Learning Interactively from Human Teachers.
[13] Peter Vamplew,et al. Human Engagement Providing Evaluative and Informative Advice for Interactive Reinforcement Learning , 2020, ArXiv.
[14] Sonia Chernova,et al. Effect of human guidance and state space size on Interactive Reinforcement Learning , 2011, 2011 RO-MAN.
[15] Ashwin Ram,et al. Transfer Learning in Real-Time Strategy Games Using Hybrid CBR/RL , 2007, IJCAI.
[16] Mostafa Ghobaei-Arani,et al. A learning‐based approach for virtual machine placement in cloud data centers , 2018, Int. J. Commun. Syst..
[17] Sam Devlin,et al. Overcoming erroneous domain knowledge in plan-based reward shaping , 2013, AAMAS.
[18] Ofra Amir,et al. Interactive Teaching Strategies for Agent Training , 2016, IJCAI.
[19] K. R. Dixon,et al. Incorporating Prior Knowledge and Previously Learned Information into Reinforcement Learning Agents , 2000 .
[20] Raúl Santos-Rodríguez,et al. Online Feature Selection for Activity Recognition using Reinforcement Learning with Multiple Feedback , 2019, ArXiv.
[21] András György,et al. Learning from Delayed Outcomes with Intermediate Observations , 2018, ArXiv.
[22] Antonio Bandera,et al. A Survey of Vision-Based Architectures for Robot Learning by Imitation , 2012, Int. J. Humanoid Robotics.
[23] Yang Gao,et al. Reinforcement Learning from Imperfect Demonstrations , 2018, ICLR.
[24] Bruno J. T. Fernandes,et al. A Robust Approach for Continuous Interactive Actor-Critic Algorithms , 2021, IEEE Access.
[25] Christian R. Shelton,et al. Balancing Multiple Sources of Reward in Reinforcement Learning , 2000, NIPS.
[26] Peter Vamplew,et al. Explainable robotic systems: understanding goal-driven actions in a reinforcement learning scenario , 2020, Neural Computing and Applications.
[27] Ioannis P. Vlahavas,et al. Learning to Teach Reinforcement Learning Agents , 2017, Mach. Learn. Knowl. Extr..
[28] Stefan Wermter,et al. Improving reinforcement learning with interactive feedback and affordances , 2014, 4th International Conference on Development and Learning and on Epigenetic Robotics.
[29] Peter Stone,et al. Half Field Offense in RoboCup Soccer: A Multiagent Reinforcement Learning Case Study , 2006, RoboCup.
[30] Eric Eaton,et al. Unsupervised Cross-Domain Transfer in Policy Gradient Reinforcement Learning via Manifold Alignment , 2015, AAAI.
[31] C. Boutilier,et al. Accelerating Reinforcement Learning through Implicit Imitation , 2003, J. Artif. Intell. Res..
[32] Neda Navidi,et al. Human AI interaction loop training: New approach for interactive reinforcement learning , 2020, ArXiv.
[33] Masashi Sugiyama,et al. Active deep Q-learning with demonstration , 2018, Machine Learning.
[34] Stefan Wermter,et al. The Hybrid Integration of Perceptual Symbol Systems and Interactive Reinforcement Learning , 2008, 2008 Eighth International Conference on Hybrid Intelligent Systems.
[35] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[36] András György,et al. Learning from Delayed Outcomes via Proxies with Applications to Recommender Systems , 2018, ICML.
[37] Peter Stone,et al. Source Task Creation for Curriculum Learning , 2016, AAMAS.
[38] Gabriel Dulac-Arnold,et al. Challenges of Real-World Reinforcement Learning , 2019, ArXiv.
[39] Peter Stone,et al. Reinforcement learning from simultaneous human and MDP reward , 2012, AAMAS.
[40] Shimon Whiteson,et al. A Survey of Multi-Objective Sequential Decision-Making , 2013, J. Artif. Intell. Res..
[41] Francisco Cruz,et al. Reinforcement learning using continuous states and interactive feedback , 2019, APPIS '19.
[42] Andrea Lockerd Thomaz,et al. Reinforcement Learning with Human Teachers: Understanding How People Want to Teach Robots , 2006, ROMAN 2006 - The 15th IEEE International Symposium on Robot and Human Interactive Communication.
[43] Stefan Wermter,et al. Multi-modal Feedback for Affordance-driven Interactive Reinforcement Learning , 2018, 2018 International Joint Conference on Neural Networks (IJCNN).
[44] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[45] Sonia Chernova,et al. Reinforcement Learning from Demonstration through Shaping , 2015, IJCAI.
[46] Andrea Lockerd Thomaz,et al. Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance , 2006, AAAI.
[47] Andrea Lockerd Thomaz,et al. Asymmetric Interpretations of Positive and Negative Human Feedback for a Social Learning Agent , 2007, RO-MAN 2007 - The 16th IEEE International Symposium on Robot and Human Interactive Communication.
[48] Shie Mannor,et al. Bayesian Reinforcement Learning , 2012, Reinforcement Learning.
[49] Sonia Chernova,et al. Learning from Demonstration for Shaping through Inverse Reinforcement Learning , 2016, AAMAS.
[50] Stefan Wermter,et al. Agent-advising approaches in an interactive reinforcement learning scenario , 2017, 2017 Joint IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob).
[51] Peter Vamplew,et al. Memory-Based Explainable Reinforcement Learning , 2019, Australasian Conference on Artificial Intelligence.
[52] Peter Stone,et al. Cobot in LambdaMOO: A Social Statistics Agent , 2000, AAAI/IAAI.
[53] Stefan Schaal,et al. Learning from Demonstration , 1996, NIPS.
[54] Luís Nunes,et al. Exchanging Advice and Learning to Trust , 2003, CIA.
[55] Stefan Wermter,et al. Improving interactive reinforcement learning: What makes a good teacher? , 2018, Connect. Sci..
[56] Robert H. Deng,et al. Privacy-Preserving Reinforcement Learning Design for Patient-Centric Dynamic Treatment Regimes , 2019, IEEE Transactions on Emerging Topics in Computing.
[57] Peter Stone,et al. Combining manual feedback with subsequent MDP reward signals for reinforcement learning , 2010, AAMAS.
[58] Volkan Cevher,et al. Interactive Teaching Algorithms for Inverse Reinforcement Learning , 2019, IJCAI.
[59] Scott Sanner,et al. Reinforcement Learning with Multiple Experts: A Bayesian Model Combination Approach , 2018, NeurIPS.
[60] Yusen Zhan,et al. Efficiently detecting switches against non-stationary opponents , 2017, Autonomous Agents and Multi-Agent Systems.
[61] Peter Vamplew,et al. Persistent Rule-based Interactive Reinforcement Learning , 2021, Neural Computing and Applications.
[62] Hiroaki Kitano,et al. RoboCup: A Challenge Problem for AI , 1997, AI Mag..
[63] Matthew E. Taylor. Assisting Transfer-Enabled Machine Learning Algorithms: Leveraging Human Knowledge for Curriculum Design , 2009, AAAI Spring Symposium: Agents that Learn from Human Teachers.
[64] Pierre-Yves Oudeyer,et al. Robot learning simultaneously a task and how to interpret human instructions , 2013, 2013 IEEE Third Joint International Conference on Development and Learning and Epigenetic Robotics (ICDL).
[65] Sam Devlin,et al. Dynamic potential-based reward shaping , 2012, AAMAS.
[66] Dongbin Zhao,et al. StarCraft Micromanagement With Reinforcement Learning and Curriculum Transfer Learning , 2018, IEEE Transactions on Emerging Topics in Computational Intelligence.
[67] Y. Niv. Reinforcement learning in the brain , 2009 .
[68] Mohammad Masdari,et al. A Survey on the Computation Offloading Approaches in Mobile Edge/Cloud Computing Environment: A Stochastic-based Perspective , 2020, Journal of Grid Computing.
[69] Matthew E. Taylor,et al. Curriculum Design for Machine Learners in Sequential Decision Tasks , 2017, IEEE Transactions on Emerging Topics in Computational Intelligence.
[70] Erik Talvitie,et al. An Experts Algorithm for Transfer Learning , 2007, IJCAI.
[71] Stefan Wermter,et al. Accelerating Deep Continuous Reinforcement Learning through Task Simplification , 2018, 2018 International Joint Conference on Neural Networks (IJCNN).
[72] Sonia Chernova,et al. Integrating reinforcement learning with human demonstrations of varying ability , 2011, AAMAS.
[73] Stefan Wermter,et al. Interactive reinforcement learning through speech guidance in a domestic scenario , 2015, 2015 International Joint Conference on Neural Networks (IJCNN).
[74] Stefan Wermter,et al. Curriculum goal masking for continuous deep reinforcement learning , 2018, 2019 Joint IEEE 9th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob).
[75] Francisco Cruz,et al. Unmanned Aerial Vehicle Control Through Domain-based Automatic Speech Recognition , 2020, Comput..
[76] Alessandra Sciutti,et al. Moody Learners - Explaining Competitive Behaviour of Reinforcement Learning Agents , 2020, 2020 Joint IEEE 10th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob).
[77] Shimon Whiteson,et al. Inverse Reinforcement Learning from Failure , 2016, AAMAS.
[78] Matthew E. Taylor,et al. Useful Policy Invariant Shaping from Arbitrary Advice , 2020, ArXiv.
[79] Stefan Wermter,et al. Multi-modal integration of dynamic audiovisual patterns for an interactive reinforcement learning scenario , 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[80] Thommen George Karimpanal,et al. Learning Transferable Domain Priors for Safe Exploration in Reinforcement Learning , 2020, 2020 International Joint Conference on Neural Networks (IJCNN).
[81] Manuela M. Veloso,et al. Probabilistic policy reuse in a reinforcement learning agent , 2006, AAMAS '06.
[82] Brett Browning,et al. Automatic weight learning for multiple data sources when learning from demonstration , 2009, 2009 IEEE International Conference on Robotics and Automation.
[83] Jason Weston,et al. Curriculum learning , 2009, ICML '09.
[84] Ruslan Salakhutdinov,et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning , 2015, ICLR.
[85] Peter Vamplew,et al. Explainable robotic systems: Interpreting outcome-focused actions in a reinforcement learning scenario , 2020, ArXiv.
[86] Stefan Wermter,et al. Continual Lifelong Learning with Neural Networks: A Review , 2019, Neural Networks.
[87] Felipe Leno da Silva,et al. Object-Oriented Curriculum Generation for Reinforcement Learning , 2018, AAMAS.
[88] Andrea Lockerd Thomaz,et al. Exploration from Demonstration for Interactive Reinforcement Learning , 2016, AAMAS.
[89] Aude Billard,et al. Transfer in inverse reinforcement learning for multiple strategies , 2013, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[90] Anna Saranti,et al. Towards multi-modal causability with Graph Neural Networks enabling information fusion for explainable AI , 2021, Inf. Fusion.
[91] Jan Peters,et al. Reinforcement learning in robotics: A survey , 2013, Int. J. Robotics Res..
[92] Bo He,et al. Human-Centered Reinforcement Learning: A Survey , 2019, IEEE Transactions on Human-Machine Systems.
[93] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[94] Garrison W. Cottrell,et al. Principled Methods for Advising Reinforcement Learning Agents , 2003, ICML.
[95] Pablo Hernandez-Leal,et al. Uncertainty-Aware Action Advising for Deep Reinforcement Learning Agents , 2020, AAAI.
[96] Eduardo F. Morales,et al. Dynamic Reward Shaping: Training a Robot by Voice , 2010, IBERAMIA.
[97] Felipe Leno da Silva,et al. Simultaneously Learning and Advising in Multiagent Reinforcement Learning , 2017, AAMAS.
[98] Matthew Hausknecht,et al. Half Field Offense: An Environment for Multiagent Learning and Ad Hoc Teamwork , 2016 .
[99] Leslie Pack Kaelbling,et al. Effective reinforcement learning for mobile robots , 2002, Proceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No.02CH37292).
[100] Felipe Leno Da Silva. Integrating Agent Advice and Previous Task Solutions in Multiagent Reinforcement Learning , 2019, AAMAS.
[101] Keisuke Nakamura,et al. A Review on Interactive Reinforcement Learning From Human Social Feedback , 2020, IEEE Access.
[102] Reid G. Simmons,et al. Complexity Analysis of Real-Time Reinforcement Learning , 1993, AAAI.
[103] Ioannis Vlahavas,et al. Reinforcement Learning and Automated Planning: A Survey , 2008 .
[104] Shimon Whiteson,et al. A survey of multi-objective sequential decision-making , 2013 .
[105] Bikramjit Banerjee,et al. General Game Learning Using Knowledge Transfer , 2007, IJCAI.
[106] Yusen Zhan,et al. Theoretically-Grounded Policy Advice from Multiple Teachers in Reinforcement Learning Settings with Applications to Negative Transfer , 2016, IJCAI.
[107] Peter Vamplew,et al. A Demonstration of Issues with Value-Based Multiobjective Reinforcement Learning Under Stochastic State Transitions , 2020, ArXiv.
[108] Andrea Lockerd Thomaz,et al. Policy Shaping: Integrating Human Feedback with Reinforcement Learning , 2013, NIPS.
[109] Andreas Holzinger,et al. Measuring the Quality of Explanations: The System Causability Scale (SCS) , 2020, KI - Künstliche Intelligenz.
[110] Preben Alstrøm,et al. Learning to Drive a Bicycle Using Reinforcement Learning and Shaping , 1998, ICML.
[111] Bruno J. T. Fernandes,et al. Human feedback in continuous actor-critic reinforcement learning , 2019, ESANN.
[112] Peter Vamplew,et al. An Evaluation Methodology for Interactive Reinforcement Learning with Simulated Users , 2021, Biomimetics.
[113] Leslie Pack Kaelbling,et al. Learning Policies for Partially Observable Environments: Scaling Up , 1997, ICML.
[114] Sheng-Tzong Cheng,et al. A framework of an agent planning with reinforcement learning for e-pet , 2013, 2013 1st International Conference on Orange Technologies (ICOT).
[115] Brett Browning,et al. A survey of robot learning from demonstration , 2009, Robotics Auton. Syst..
[116] Nikhil Churamani,et al. iCub: Learning Emotion Expressions using Human Reward , 2020, ArXiv.
[117] Matthew E. Taylor,et al. Combining Multiple Correlated Reward and Shaping Signals by Measuring Confidence , 2014, AAAI.
[118] Wenbing Huang,et al. Reinforcement Learning from Imperfect Demonstrations under Soft Expert Guidance , 2019, AAAI.
[119] Peter Stone,et al. Interactively shaping agents via human reinforcement: the TAMER framework , 2009, K-CAP '09.
[120] Reinaldo A. C. Bianchi,et al. Transferring knowledge as heuristics in reinforcement learning: A case-based approach , 2015, Artif. Intell..
[121] Maya Cakmak,et al. Power to the People: The Role of Humans in Interactive Machine Learning , 2014, AI Mag..
[122] Peter Stone,et al. Autonomous Task Sequencing for Customized Curriculum Design in Reinforcement Learning , 2017, IJCAI.
[123] Matthew E. Taylor,et al. Multi-objectivization and ensembles of shapings in reinforcement learning , 2017, Neurocomputing.
[124] Peter Stone,et al. Agents teaching agents: a survey on inter-agent transfer learning , 2019, Autonomous Agents and Multi-Agent Systems.
[125] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[126] Bradley C. Love,et al. How Humans Teach Agents - A New Experimental Perspective , 2012, Int. J. Soc. Robotics.
[127] Sam Devlin,et al. Expressing Arbitrary Reward Functions as Potential-Based Advice , 2015, AAAI.
[128] Andrea Lockerd Thomaz,et al. Active Attention-Modified Policy Shaping: Socially Interactive Agents Track , 2019, AAMAS.
[129] Roland Siegwart,et al. Comparing Task Simplifications to Learn Closed-Loop Object Picking Using Deep Reinforcement Learning , 2018, IEEE Robotics and Automation Letters.
[130] Luca Maria Gambardella,et al. Ant-Q: A Reinforcement Learning Approach to the Traveling Salesman Problem , 1995, ICML.
[131] Mohan Sridharan,et al. What Can I Not Do? Towards an Architecture for Reasoning about and Learning Affordances , 2017, ICAPS.
[132] Eyke Hüllermeier,et al. Preference-based reinforcement learning: a formal framework and a policy iteration algorithm , 2012, Mach. Learn..
[133] Peter Stone,et al. Transfer Learning via Inter-Task Mappings for Temporal Difference Learning , 2007, J. Mach. Learn. Res..
[134] Scott Kuindersma,et al. Robot learning from demonstration by constructing skill trees , 2012, Int. J. Robotics Res..
[135] Sergio M. M. Fernandes,et al. A Robust Approach for Continuous Interactive Reinforcement Learning , 2020, HAI.
[136] Peter Stone,et al. Agents teaching agents: a survey on inter-agent transfer learning , 2020 .
[137] Reinaldo A. C. Bianchi,et al. Heuristic Reinforcement Learning Applied to RoboCup Simulation Agents , 2008, RoboCup.
[138] Cynthia Breazeal,et al. Training a Robot via Human Feedback: A Case Study , 2013, ICSR.
[139] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[140] Hui Xiong,et al. A Comprehensive Survey on Transfer Learning , 2019, Proceedings of the IEEE.
[141] Peter Stone,et al. Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res..
[142] Andrea Lockerd Thomaz,et al. Abstraction from demonstration for efficient reinforcement learning in high-dimensional domains , 2014, Artif. Intell..
[143] Stefan Wermter,et al. Learning contextual affordances with an associative neural architecture , 2016, ESANN.
[144] Pierre-Yves Oudeyer,et al. Robotic clicker training , 2002, Robotics Auton. Syst..
[145] Marco Wiering,et al. Ensemble Algorithms in Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[146] B F Skinner,et al. The shaping of phylogenic behavior. , 1975, Journal of the experimental analysis of behavior.
[147] Yaneer Bar-Yam,et al. Segregation dynamics with reinforcement learning and agent based modeling , 2019, Scientific Reports.
[148] Shimon Whiteson,et al. Transfer via inter-task mappings in policy search reinforcement learning , 2007, AAMAS '07.
[149] Andreas Holzinger,et al. Interactive machine learning for health informatics: when do we need the human-in-the-loop? , 2016, Brain Informatics.
[150] Brett Browning,et al. Learning by demonstration with critique from a human teacher , 2007, 2007 2nd ACM/IEEE International Conference on Human-Robot Interaction (HRI).
[151] Eric Horvitz,et al. Combining human and machine intelligence in large-scale crowdsourcing , 2012, AAMAS.
[152] Takeo Igarashi,et al. A Survey on Interactive Reinforcement Learning: Design Principles and Open Challenges , 2020, Conference on Designing Interactive Systems.
[153] Gerald Tesauro,et al. Extending Q-Learning to General Adaptive Multi-Agent Systems , 2003, NIPS.
[154] Cynthia Breazeal,et al. Real-Time Interactive Reinforcement Learning for Robots , 2005 .
[155] Peter Stone,et al. Value Functions for RL-Based Behavior Transfer: A Comparative Study , 2005, AAAI.
[156] Richard Dazeley,et al. Deep Reinforcement Learning with Interactive Feedback in a Human-Robot Environment , 2020, ArXiv.
[157] Peter Stone,et al. Autonomous transfer for reinforcement learning , 2008, AAMAS.
[158] Marcin Andrychowicz,et al. Overcoming Exploration in Reinforcement Learning with Demonstrations , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[159] Kening Zhu,et al. Emergency-Response Locomotion of Hexapod Robot with Heuristic Reinforcement Learning Using Q-Learning , 2019, ICR.
[160] Pieter Abbeel,et al. An Algorithmic Perspective on Imitation Learning , 2018, Found. Trends Robotics.
[161] Bing Liu,et al. Lifelong machine learning: a paradigm for continuous learning , 2017, Frontiers of Computer Science.
[162] Sam Devlin,et al. Theoretical considerations of potential-based reward shaping for multi-agent systems , 2011, AAMAS.
[163] Richard Evans,et al. Deep Reinforcement Learning in Large Discrete Action Spaces , 2015, 1512.07679.
[164] Stefan Wermter,et al. Action Selection Methods in a Robotic Reinforcement Learning Scenario , 2018, 2018 IEEE Latin American Conference on Computational Intelligence (LA-CCI).
[165] J. Karlsson,et al. Learning to Play Games from Multiple Imperfect Teachers , 2014 .
[166] Carme Torras,et al. A robot learning from demonstration framework to perform force-based manipulation tasks , 2013, Intelligent Service Robotics.
[167] Pierpaolo Pontrandolfo,et al. Inventory management in supply chains: a reinforcement learning approach , 2002 .
[168] Peter Stone,et al. Reinforcement learning from human reward: Discounting in episodic tasks , 2012, 2012 IEEE RO-MAN: The 21st IEEE International Symposium on Robot and Human Interactive Communication.
[169] Stefan Wermter,et al. Continual Lifelong Learning with Neural Networks: A Review , 2018, Neural Networks.
[170] K. Subramanian,et al. Learning Options through Human Interaction , 2011 .
[171] Peter Vamplew,et al. Explainable reinforcement learning for broad-XAI: a conceptual framework and survey , 2021, Neural Computing and Applications.
[172] Peter Vamplew,et al. Levels of explainable artificial intelligence for human-aligned conversational explanations , 2021, Artif. Intell..
[173] Jiming Liu,et al. Partially Observable Reinforcement Learning for Sustainable Active Surveillance , 2018, KSEM.
[174] Matthew E. Taylor,et al. Teaching on a budget: agents advising agents in reinforcement learning , 2013, AAMAS.