MithraLabel: Flexible Dataset Nutritional Labels for Responsible Data Science

Using inappropriate datasets for data science tasks can be harmful, especially in applications that impact humans. Targeting data ethics, we demonstrate MithraLabel, a system that generates task-specific information about a dataset as a set of visual widgets. Together, these widgets form a flexible "nutritional label" that helps a user determine whether the dataset is fit for the task at hand.
