Multicalibration: Calibration for the (Computationally-Identifiable) Masses

We develop and study multicalibration as a new measure of fairness in machine learning that aims to mitigate inadvertent or malicious discrimination that is introduced at training time (even from ground truth data). Multicalibration guarantees meaningful (calibrated) predictions for every subpopulation that can be identified within a specified class of computations. The specified class can be quite rich; in particular, it can contain many overlapping subgroups of a protected group. We demonstrate that in many settings this strong notion of protection from discrimination is provably attainable and aligned with the goal of accurate predictions. Along the way, we present algorithms for learning a multicalibrated predictor, study the computational complexity of this task, and illustrate tight connections to the agnostic learning model.
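To make the guarantee concrete, the following is a minimal sketch of an empirical multicalibration audit. It is a simplified illustration, not the paper's algorithm or its exact definition. It assumes predictions in [0, 1], equal-width prediction bins, and subgroups given explicitly as lists of indices. The function name `multicalibration_violation` and the parameters `n_bins` and `min_count` are hypothetical names chosen for this example.

```python
def multicalibration_violation(preds, labels, groups, n_bins=10, min_count=1):
    """Return the largest calibration gap over all (subgroup, bin) cells.

    preds     : list of predicted probabilities in [0, 1]
    labels    : list of observed outcomes (0 or 1)
    groups    : dict mapping a subgroup name to an iterable of indices;
                subgroups may overlap (assumed representation)
    n_bins    : number of equal-width prediction bins (assumption)
    min_count : cells with fewer members than this are skipped
    """
    worst = 0.0
    for members in groups.values():
        # Partition this subgroup's members by prediction bin.
        cells = {}
        for i in members:
            b = min(int(preds[i] * n_bins), n_bins - 1)
            cells.setdefault(b, []).append(i)
        for idx in cells.values():
            if len(idx) < min_count:
                continue
            mean_pred = sum(preds[i] for i in idx) / len(idx)
            mean_label = sum(labels[i] for i in idx) / len(idx)
            # Calibration gap inside this cell.
            worst = max(worst, abs(mean_label - mean_pred))
    return worst


# Toy data: subgroup A has an 80% positive rate, subgroup B has 20%.
labels = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
groups = {"all": range(20), "A": range(10), "B": range(10, 20)}

# Predicting 0.5 everywhere is calibrated overall but not on A or B.
coarse = [0.5] * 20
# Predicting each subgroup's own rate is calibrated on every subgroup.
refined = [0.8] * 10 + [0.2] * 10

coarse_gap = multicalibration_violation(coarse, labels, groups)
refined_gap = multicalibration_violation(refined, labels, groups)
```

In this toy example the constant predictor passes an overall calibration check yet is off by roughly 0.3 on each subgroup, while the refined predictor is calibrated on all three groups. This is the kind of gap that calibration over a rich class of subpopulations is meant to rule out.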
