Contrastive Explanations for Large Errors in Retail Forecasting Predictions through Monte Carlo Simulations

At Ahold Delhaize, there is interest in using more complex machine learning techniques for sales forecasting. It is difficult to convince analysts and their superiors to adopt these techniques, since the models are considered 'black boxes,' even when they outperform the models currently in use. We aim to explore the impact of contrastive explanations of large errors on users' attitudes towards a 'black-box' model. This work makes two contributions. The first is an algorithm, Monte Carlo Bounds for Reasonable Predictions (MC-BRP): given a large error, it uses Monte Carlo simulations to determine (1) the feature values that would result in a reasonable prediction, and (2) the general trend between each feature and the target. The second is an evaluation of MC-BRP and its outcomes, with both objective and subjective components. We evaluate on a real dataset with real users from Ahold Delhaize, conducting a user study to determine whether explanations generated by MC-BRP help users understand why a prediction results in a large error, and whether this promotes trust in an automatically learned model. The study shows that, when provided with these contrastive explanations, users answer objective questions about the model's predictions with 81.7% overall accuracy. We also show that users who saw MC-BRP explanations understand why the model makes large errors in predictions significantly better than users in the control group.
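
To make the idea concrete, below is a minimal Python sketch of the core MC-BRP loop for a single feature. It assumes a scikit-learn-style regressor exposing `model.predict`, a pool of plausible values `feature_samples` drawn from the training data, and a relative-error tolerance `error_tol` defining a 'reasonable' prediction; these names and the tolerance rule are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def mc_brp_feature(model, x, y_true, feature_idx, feature_samples,
                   n_sim=1000, error_tol=0.3, seed=None):
    """Sketch of Monte Carlo Bounds for Reasonable Predictions, one feature.

    Perturbs feature `feature_idx` of instance `x` with Monte Carlo draws,
    keeps the values whose prediction falls within `error_tol` relative
    error of the true target, and reports (1) the bounds of those values
    and (2) the general trend between the feature and the prediction.
    """
    rng = np.random.default_rng(seed)
    # Draw plausible values for this feature (assumption: sampled from
    # observed training values; the paper may use a different scheme).
    draws = rng.choice(np.asarray(feature_samples), size=n_sim)

    # Build n_sim copies of the instance, varying only the chosen feature.
    X_sim = np.tile(np.asarray(x, dtype=float), (n_sim, 1))
    X_sim[:, feature_idx] = draws
    preds = model.predict(X_sim)

    # "Reasonable" here = relative error within the tolerance (assumption).
    reasonable = np.abs(preds - y_true) / np.abs(y_true) <= error_tol
    if not reasonable.any():
        return None  # no perturbation of this feature yields a reasonable prediction

    lo, hi = draws[reasonable].min(), draws[reasonable].max()
    # Sign of the correlation gives the general feature-target trend.
    corr = np.corrcoef(draws, preds)[0, 1]
    return {"bounds": (lo, hi),
            "trend": "positive" if corr > 0 else "negative"}
```

Running this per feature yields, for each one, a range of values that would have produced a reasonable prediction and the direction of its effect, which together form the contrastive explanation shown to users.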
