Achieving Transparency Report Privacy in Linear Time

An accountable algorithmic transparency report (ATR) should ideally investigate (a) the transparency of the underlying algorithm and (b) the fairness of the algorithmic decisions, while at the same time preserving data subjects' privacy. However, a formal, provable study of the impact on data subjects' privacy incurred by releasing an ATR (one that investigates transparency and fairness) is yet to be addressed in the literature. The far-reaching benefit of such a study lies in the methodical characterization of privacy-utility trade-offs for the public release of ATRs, and in their consequent application-specific impact on society, politics, and economics. In this paper, we first investigate and demonstrate the potential privacy hazards introduced by deploying transparency and fairness measures in released ATRs. To preserve data subjects' privacy, we then propose a linear-time optimal-privacy scheme, built upon standard linear fractional programming (LFP) theory, for releasing ATRs, subject to constraints that control the tolerance of privacy perturbation on the utility of transparency schemes. Subsequently, we quantify the privacy-utility trade-offs induced by our scheme, and analyze the impact of privacy perturbation on fairness measures in ATRs. To the best of our knowledge, this is the first analytical work to simultaneously address trade-offs among the triad of privacy, utility, and fairness in the context of algorithmic transparency reports.
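To make the LFP machinery concrete, the sketch below shows the standard Charnes-Cooper transformation, which reduces a linear fractional program to an ordinary linear program. This is a minimal illustration of the general technique the abstract names, not the paper's actual formulation: the ratio objective (a toy "privacy gain over utility cost"), the constraint data, and the solve_lfp helper are all hypothetical, and a generic LP solver such as scipy.optimize.linprog does not by itself attain the paper's linear-time guarantee.

```python
# Minimal sketch: solving a linear fractional program (LFP)
#   maximize (c @ x + alpha) / (d @ x + beta)
#   s.t.     A @ x <= b,  x >= 0,  d @ x + beta > 0
# via the Charnes-Cooper transformation to a linear program.
# All numbers and names here are illustrative assumptions.
import numpy as np
from scipy.optimize import linprog

def solve_lfp(c, alpha, d, beta, A, b):
    """Return an optimal x for the LFP above (hypothetical helper)."""
    n = len(c)
    # Substitution y = t * x with t = 1 / (d @ x + beta) linearizes the
    # ratio: the objective becomes c @ y + alpha * t. Variables: [y, t].
    obj = -np.concatenate([c, [alpha]])           # linprog minimizes
    # Homogenized original constraints: A @ y - b * t <= 0.
    A_ub = np.hstack([A, -b.reshape(-1, 1)])
    b_ub = np.zeros(A.shape[0])
    # Normalization fixing the denominator: d @ y + beta * t == 1.
    A_eq = np.concatenate([d, [beta]]).reshape(1, -1)
    b_eq = np.array([1.0])
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n + 1))
    y, t = res.x[:n], res.x[n]
    return y / t                                  # recover x = y / t

# Toy instance: two perturbation parameters and one utility-tolerance
# constraint x1 + 2*x2 <= 4 (made-up numbers for illustration only).
c, alpha = np.array([2.0, 1.0]), 0.0   # "privacy gain" numerator
d, beta  = np.array([1.0, 1.0]), 1.0   # "utility cost" denominator
A, b = np.array([[1.0, 2.0]]), np.array([4.0])
print(solve_lfp(c, alpha, d, beta, A, b))   # -> approx. [4., 0.]
```

In this toy instance the optimum pushes all of the perturbation budget onto the cheaper coordinate, yielding x = (4, 0) with ratio 8/5; the same reduction applies whenever the denominator stays positive over the feasible set.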
