Measurement and Fairness

We propose measurement modeling from the quantitative social sciences as a framework for understanding fairness in computational systems. Computational systems often involve unobservable theoretical constructs, such as socioeconomic status, teacher effectiveness, and risk of recidivism. Such constructs cannot be measured directly and must instead be inferred from measurements of observable properties (and other unobservable theoretical constructs) thought to be related to them; that is, they must be operationalized via a measurement model. This process, which necessarily involves making assumptions, introduces the potential for mismatches between the theoretical understanding of the construct purported to be measured and its operationalization. We argue that many of the harms discussed in the literature on fairness in computational systems are direct results of such mismatches. We show how some of these harms could have been anticipated and, in some cases, mitigated if viewed through the lens of measurement modeling. To do this, we contribute fairness-oriented conceptualizations of construct reliability and construct validity that unite traditions from political science, education, and psychology and provide a set of tools for making explicit and testing assumptions about constructs and their operationalizations. We then turn to fairness itself, an essentially contested construct that has different theoretical understandings in different contexts. We argue that this contestedness underlies recent debates about fairness definitions: although these debates appear to be about different operationalizations, they are, in fact, debates about different theoretical understandings of fairness. We show how measurement modeling can provide a framework for getting to the core of these debates.
