Social-minded Measures of Data Quality

For decades, research on data-driven algorithmic systems has focused on improving efficiency (making data access faster and lighter) and effectiveness (providing relevant results to users). As data-driven decision making becomes prevalent, there is an increasing need for new measures for evaluating the quality of data systems. In this article, we make the case for social-minded measures, that is, measures that evaluate the effect of a system on society. We focus on three such measures: diversity (ensuring that all relevant aspects are represented), lack of bias (processing data without unjustifiable concentration on a particular side), and fairness (non-discriminatory treatment of data and people).
