Certifying Robustness to Programmable Data Bias in Decision Trees

Datasets can be biased due to societal inequities, human biases, underrepresentation of minorities, etc. Our goal is to certify that models produced by a learning algorithm are pointwise-robust to potential dataset biases. This is a challenging problem: it entails learning models for a large, or even infinite, number of datasets, ensuring that they all produce the same prediction. We focus on decision-tree learning due to the interpretable nature of the models. Our approach allows programmatically specifying bias models across a variety of dimensions (e.g., missing data for minorities), composing types of bias, and targeting bias towards a specific group. To certify robustness, we use a novel symbolic technique to evaluate a decision-tree learner on a large, or infinite, number of datasets, certifying that each and every dataset produces the same prediction for a specific test point. We evaluate our approach on datasets that are commonly used in the fairness literature, and demonstrate our approach’s viability on a range of bias models.
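The property being certified is easiest to see in concrete terms. The sketch below is a hypothetical brute-force illustration of that property, not the paper's symbolic technique: it enumerates every dataset reachable under one simple bias model (up to k label flips restricted to a target group), retrains a decision tree on each, and checks whether a fixed test point always receives the same prediction. The toy dataset, the label-flip bias model, and the helper names are all assumptions made for illustration.

```python
# Brute-force check of pointwise robustness to a programmable bias model.
# This stands in for the paper's symbolic evaluation, which avoids
# enumerating the (possibly infinite) set of biased datasets.
from itertools import combinations

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy dataset: (features, label, group).
# Group 1 plays the role of the minority group targeted by the bias model.
X = np.array([[0.1], [0.4], [0.5], [0.9], [0.3], [0.7]])
y = np.array([0, 0, 1, 1, 0, 1])
group = np.array([0, 1, 0, 1, 1, 0])


def biased_datasets(y, group, target_group, k):
    """Yield every label vector reachable by flipping at most k labels
    of records belonging to `target_group` (one simple bias model)."""
    idx = np.flatnonzero(group == target_group)
    for flips in range(k + 1):
        for subset in combinations(idx, flips):
            y_biased = y.copy()
            y_biased[list(subset)] ^= 1  # flip label 0 <-> 1
            yield y_biased


def certify_pointwise(X, y, group, x_test, target_group=1, k=1):
    """Return True iff the learned tree predicts the same label for
    `x_test` under every dataset allowed by the bias model."""
    predictions = set()
    for y_biased in biased_datasets(y, group, target_group, k):
        tree = DecisionTreeClassifier(max_depth=2, random_state=0)
        tree.fit(X, y_biased)
        predictions.add(int(tree.predict([[x_test]])[0]))
    return len(predictions) == 1


print(certify_pointwise(X, y, group, x_test=0.85, k=1))
```

Composing bias models (e.g., label flips plus missing records) or targeting other groups amounts to changing `biased_datasets`; the certification check itself is unchanged, which is why the paper treats the bias model as a programmable component.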
