Interpretable Summaries of Black Box Incident Triaging with Subgroup Discovery

The need for predictive maintenance grows with the increasing number of incidents reported by monitoring systems and by equipment and software users. On the front line, on-call engineers (OCEs) must quickly assess the severity of an incident and decide which service to contact for corrective action. To automate these decisions, several predictive models have been proposed, but the most effective ones are opaque (black boxes), which strongly limits their adoption. In this paper, we propose an efficient black box model based on 170K incidents reported to our company over the last 7 years, and we emphasize the need to automate triage when incidents are reported en masse across thousands of servers running our product, an ERP. Recent developments in eXplainable Artificial Intelligence (XAI) help provide global explanations of the model and, most importantly, local explanations for each model prediction. However, presenting a human with an explanation for every outcome is not feasible when dealing with a large number of daily predictions. To address this problem, we propose an original data-mining method rooted in Subgroup Discovery, a pattern mining technique with the natural ability to group objects that share similar explanations of their black box predictions and to provide a description for each group. We evaluate this approach and present preliminary results that are encouraging for effective adoption by OCEs. We believe this approach offers a new way to address the problem of model-agnostic outcome explanation.
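
To make the idea of grouping incidents that share similar explanations more concrete, here is a minimal Python sketch. It is not the method evaluated in the paper: the toy incidents, the attribute names, the two-dimensional explanation vectors, and the quality measure (subgroup coverage weighted by how much tighter the explanations are inside the subgroup than overall) are all assumptions chosen for illustration. The search is a plain exhaustive enumeration of short attribute-value conjunctions.

```python
from itertools import combinations

# Hypothetical toy data: each incident has descriptive attributes and a
# local-explanation vector (e.g. feature attributions of a black box model).
incidents = [
    ({"module": "billing", "source": "monitoring"}, (0.9, 0.1)),
    ({"module": "billing", "source": "user"},       (0.8, 0.2)),
    ({"module": "billing", "source": "monitoring"}, (0.9, 0.0)),
    ({"module": "stock",   "source": "user"},       (0.1, 0.9)),
    ({"module": "stock",   "source": "monitoring"}, (0.2, 0.8)),
    ({"module": "hr",      "source": "user"},       (0.5, 0.5)),
]


def spread(vectors):
    """Mean squared distance of the vectors to their centroid."""
    n, dim = len(vectors), len(vectors[0])
    centroid = [sum(v[i] for v in vectors) / n for i in range(dim)]
    return sum(
        sum((v[i] - centroid[i]) ** 2 for i in range(dim)) for v in vectors
    ) / n


def covers(description, attributes):
    """True if the incident satisfies every (attribute, value) condition."""
    return all(attributes.get(key) == value for key, value in description)


def discover_subgroups(data, max_conditions=2, min_size=2, top_k=3):
    """Rank conjunctive descriptions by an assumed quality measure:
    sqrt(coverage) * (overall spread - spread inside the subgroup)."""
    global_spread = spread([expl for _, expl in data])
    selectors = sorted({(k, v) for attrs, _ in data for k, v in attrs.items()})
    results = []
    for size in range(1, max_conditions + 1):
        for description in combinations(selectors, size):
            # Skip descriptions that constrain the same attribute twice.
            if len({key for key, _ in description}) < size:
                continue
            members = [expl for attrs, expl in data if covers(description, attrs)]
            if len(members) < min_size:
                continue
            coverage = len(members) / len(data)
            quality = coverage ** 0.5 * (global_spread - spread(members))
            results.append((quality, description, len(members)))
    results.sort(key=lambda r: r[0], reverse=True)
    return results[:top_k]


top_subgroups = discover_subgroups(incidents)
for quality, description, size in top_subgroups:
    print(f"{quality:.3f}  size={size}  {dict(description)}")
```

Each returned description is a human-readable conjunction of conditions covering incidents whose explanations are close to one another, so an engineer could review one summary per group rather than one explanation per prediction.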
