Evaluation Metrics

• The Pairwise Rank Agreement (PRA) metric captures whether the relative ordering of every pair of features is the same in a given post hoc explanation and in the corresponding ground-truth explanation; that is, if feature A is more important than feature B according to one explanation, the same should hold for the other. More specifically, the metric computes the fraction of feature pairs whose relative ordering is the same in the two explanations.
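
The definition above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `pairwise_rank_agreement` is hypothetical, importance is assumed to be the absolute value of each feature attribution, and a tied pair is assumed to count as agreeing only when it is tied in both explanations. The text does not specify either choice.

```python
from itertools import combinations


def pairwise_rank_agreement(expl_a, expl_b):
    """Fraction of feature pairs ordered the same way by both explanations.

    expl_a, expl_b: sequences of per-feature attributions, aligned so that
    index i refers to the same feature in both.
    """
    if len(expl_a) != len(expl_b):
        raise ValueError("explanations must cover the same features")
    if len(expl_a) < 2:
        raise ValueError("need at least two features to form a pair")

    # Assumption: a feature's importance is the magnitude of its attribution.
    imp_a = [abs(v) for v in expl_a]
    imp_b = [abs(v) for v in expl_b]

    def sign(x):
        # -1, 0, or +1 depending on the sign of x
        return (x > 0) - (x < 0)

    pairs = list(combinations(range(len(imp_a)), 2))
    # Assumption: a pair agrees when the sign of the importance difference
    # matches, so a tie agrees only with a tie.
    agree = sum(
        sign(imp_a[i] - imp_a[j]) == sign(imp_b[i] - imp_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


ground_truth = [0.5, -0.3, 0.1]
explanation = [0.4, 0.1, -0.2]
pra = pairwise_rank_agreement(explanation, ground_truth)
```

In this made-up example the two explanations agree on the pairs (feature 0, feature 1) and (feature 0, feature 2) but disagree on (feature 1, feature 2), so `pra` is 2/3. A value of 1 means every pair is ordered identically; a value of 0 means every pair is ordered differently.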
