Collaborative Explanation of Deep Models with Limited Interaction for Trade Secret and Privacy Preservation

An ever-increasing number of decisions affecting our lives are made by algorithms. For this reason, algorithmic transparency is becoming a pressing need: automated decisions should be explainable and unbiased. A straightforward solution is to make the decision algorithms open-source, so that everyone can verify them and reproduce their outcomes. However, in many situations, the source code or the training data of the algorithms cannot be published for industrial or intellectual property reasons, as they are the result of long and costly experience (this is typically the case in banking or insurance, for example). We present an approach whereby individual subjects on whom automated decisions are made can collaboratively elicit, in a privacy-preserving manner, a rule-based approximation of the model underlying the decision algorithm, based on limited interaction with the algorithm or even only on how they themselves have been classified. Furthermore, being rule-based, the approximation thus obtained can be used to detect potential discrimination. We present empirical work to demonstrate the practicality of our ideas.
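To make the idea concrete, the following is a minimal sketch of the kind of rule-based approximation the abstract describes: subjects pool the records on which they were classified and fit an interpretable surrogate (here, a shallow decision tree) to mimic the black-box decisions. All names and the synthetic data are illustrative assumptions, not the paper's actual protocol; in particular, the real approach would exchange the subjects' records in a privacy-preserving way (e.g., via randomized response), which this sketch omits.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Hypothetical black-box decision model the subjects cannot inspect
# (e.g., a bank's credit scorer); used here only to generate labels.
def black_box_decision(X):
    # Accept if (normalized) income is high enough relative to the requested amount.
    return (X[:, 0] - 0.5 * X[:, 1] > 0.2).astype(int)

# Each subject contributes the features they submitted and the decision they received.
n_subjects = 500
X_pooled = rng.uniform(0.0, 1.0, size=(n_subjects, 2))   # columns: [income, amount]
y_pooled = black_box_decision(X_pooled)                   # decisions observed by the subjects

# Rule-based surrogate: a shallow decision tree fitted on the pooled records.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_pooled, y_pooled)

# Fidelity: how often the surrogate agrees with the black box on fresh inputs.
X_test = rng.uniform(0.0, 1.0, size=(1000, 2))
fidelity = np.mean(surrogate.predict(X_test) == black_box_decision(X_test))
print(f"Fidelity to the black box: {fidelity:.2%}")

# Human-readable rules that can then be inspected for potential discrimination.
print(export_text(surrogate, feature_names=["income", "amount"]))
```

Because the surrogate is expressed as explicit rules over named attributes, a subject (or a regulator) can check whether protected attributes, or proxies for them, drive the decisions.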
