Stefano Ermon,et al. A Theory of Usable Information Under Computational Constraints , 2020, ICLR.
 Henning Müller,et al. Regression Concept Vectors for Bidirectional Explanations in Histopathology , 2018, MLCN/DLF/iMIMIC@MICCAI.
 M. Gervasio,et al. Interestingness Elements for Explainable Reinforcement Learning: Understanding Agents' Capabilities and Limitations , 2019, Artif. Intell..
 Abhishek Das,et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization , 2016, International Journal of Computer Vision.
 Andrea Vedaldi,et al. Interpretable Explanations of Black Boxes by Meaningful Perturbation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
 Ulrich Paquet,et al. Assessing Game Balance with AlphaZero: Exploring Alternative Rule Sets in Chess , 2020, ArXiv.
 Tiago Pimentel,et al. A Bayesian Framework for Information-Theoretic Probing , 2021, EMNLP.
 Ilkay Öksüz,et al. Global and Local Interpretability for Cardiac MRI Classification , 2019, MICCAI.
 Alan Fern,et al. Explainable Reinforcement Learning via Reward Decomposition , 2019 .
 H. Sebastian Seung,et al. Learning the parts of objects by non-negative matrix factorization , 1999, Nature.
 Ivan Titov,et al. Information-Theoretic Probing with Minimum Description Length , 2020, EMNLP.
 Finale Doshi-Velez,et al. Promises and Pitfalls of Black-Box Concept Learning Models , 2021, ArXiv.
 Tim Miller,et al. Explainable Reinforcement Learning Through a Causal Lens , 2020, AAAI.
 Artur S. d'Avila Garcez,et al. Towards Symbolic Reinforcement Learning with Common Sense , 2018, ArXiv.
 David Filliat,et al. DisCoRL: Continual Reinforcement Learning via Policy Distillation , 2019, ArXiv.
 Siddhartha Sen,et al. Aligning Superhuman AI with Human Behavior: Chess as a Model System , 2020, KDD.
 Yoshua Bengio,et al. Understanding intermediate layers using linear classifier probes , 2016, ICLR.
 Bolei Zhou,et al. Understanding the role of individual units in a deep neural network , 2020, Proceedings of the National Academy of Sciences.
 Ludovic Denoyer,et al. EDUCE: Explaining model Decisions through Unsupervised Concepts Extraction , 2019, ArXiv.
 Demis Hassabis,et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play , 2018, Science.
 Anna Rumshisky,et al. A Primer in BERTology: What We Know About How BERT Works , 2020, Transactions of the Association for Computational Linguistics.
 Been Kim,et al. Towards A Rigorous Science of Interpretable Machine Learning , 2017, 1702.08608.
 Siddhartha Sen,et al. Learning Personalized Models of Human Behavior in Chess , 2020, ArXiv.
 Volker Gruhn,et al. Domain-Level Explainability - A Challenge for Creating Trust in Superhuman AI Strategies , 2020, ArXiv.
 Omer Levy,et al. Emergent linguistic structure in artificial neural networks trained by self-supervision , 2020, Proceedings of the National Academy of Sciences.
 Hiroshi Kawano. Hierarchical sub-task decomposition for reinforcement learning of multi-robot delivery mission , 2013, 2013 IEEE International Conference on Robotics and Automation.
 Tess Berthier,et al. UBS: A Dimension-Agnostic Metric for Concept Vector Interpretability Applied to Radiomics , 2019, iMIMIC/ML-CDS@MICCAI.
 David Filliat,et al. S-RL Toolbox: Environments, Datasets and Evaluation Metrics for State Representation Learning , 2018, ArXiv.
 Subbarao Kambhampati,et al. TLdR: Policy Summarization for Factored SSP Problems Using Temporal Abstractions , 2020, ICAPS.
 Scott Lundberg,et al. A Unified Approach to Interpreting Model Predictions , 2017, NIPS.
 Yarin Gal,et al. Real Time Image Saliency for Black Box Classifiers , 2017, NIPS.
 Yash Goyal,et al. Explaining Classifiers with Causal Concept Effect (CaCE) , 2019, ArXiv.
 K. Kersting,et al. Learning to Play the Chess Variant Crazyhouse Above World Champion Level With Deep Neural Networks and Human Data , 2019, Frontiers in Artificial Intelligence.
 C. Rudin,et al. Concept Whitening for Interpretable Image Recognition , 2020, Nat. Mach. Intell..
 Nicholas McCarthy,et al. SentiMATE: Learning to play Chess through Natural Language Processing , 2019, ArXiv.
 Shaobo Hou,et al. Concept-based model explanations for Electronic Health Records , 2020, ArXiv.
 Bolei Zhou,et al. Network Dissection: Quantifying Interpretability of Deep Visual Representations , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
 Alan Fern,et al. Learning Finite State Representations of Recurrent Policy Networks , 2019, ICLR.
 Richard J. Duro,et al. Open-Ended Learning: A Conceptual Framework Based on Representational Redescription , 2018, Front. Neurorobot..
 Romain Laroche,et al. Hybrid Reward Architecture for Reinforcement Learning , 2017, NIPS.
 Tommi S. Jaakkola,et al. Towards Robust Interpretability with Self-Explaining Neural Networks , 2018, NeurIPS.
 Richard J. Duro,et al. DREAM Architecture: a Developmental Approach to Open-Ended Learning in Robotics , 2020, ArXiv.
 Francisco Herrera,et al. Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI , 2020, Inf. Fusion.