Towards Trustworthiness in the Context of Explainable Search

Explainable AI (XAI) is currently a vibrant research topic. However, the absence of ground-truth explanations makes it difficult to evaluate XAI systems such as explainable search. We present SIMFIC 2.0 (Similarity in Fiction), an enhanced version of a recent explainable search system, and evaluate both its retrieval performance and the XAI facet of Trustworthiness. The system retrieves books similar to a selected book in a query-by-example setting, with the goal of explaining the notion of similarity between fiction books. We extract hand-crafted interpretable features from the books and generate global explanations by fitting a linear regression, as well as local explanations based on similarity measures. Trustworthiness is evaluated through user studies, while ranking performance is compared via analysis of user clicks. Eye tracking is used to investigate how users attend to the explanation elements when interacting with the interface. Initial experiments show statistically significant results on the Trustworthiness of the system, paving the way for further research directions that are currently under investigation.
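The abstract does not give the exact formulation of the explanations, but the global/local split it describes can be illustrated with a minimal sketch. All data, feature names, and shapes below are hypothetical assumptions for illustration: global weights come from a linear regression fit over interpretable features, and a local explanation compares the query book to one retrieved book feature by feature.

```python
import numpy as np
from numpy.linalg import lstsq

# Hypothetical toy data: rows are books, columns are hand-crafted
# interpretable features (e.g. sentence length, sentiment, topic share).
rng = np.random.default_rng(0)
X = rng.random((50, 3))                              # 50 books, 3 features
w_true = np.array([0.6, 0.1, 0.3])                   # assumed ground truth for the demo
y = X @ w_true + 0.01 * rng.standard_normal(50)      # similarity scores to a query book

# Global explanation: fit a linear regression; the learned weights
# indicate how strongly each feature drives similarity overall.
A = np.hstack([X, np.ones((50, 1))])                 # append an intercept column
coef, *_ = lstsq(A, y, rcond=None)
global_weights = coef[:-1]

# Local explanation (sketch): per-feature closeness between the query
# book and one retrieved book, suggesting *why* this book matched.
query, retrieved = X[0], X[1]
local_contrib = 1.0 - np.abs(query - retrieved)      # higher = more similar on that feature
```

In this sketch the regression recovers weights close to the assumed `w_true`, so the first feature would be presented as the dominant global driver of similarity; the actual SIMFIC 2.0 features and similarity measure may differ.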
