Keeping Designers in the Loop: Communicating Inherent Algorithmic Trade-offs Across Multiple Objectives

Artificial intelligence algorithms have been used to enhance a wide variety of products and services, including assisting human decision making in high-stakes contexts. However, these algorithms are complex and involve inherent trade-offs, notably between prediction accuracy and fairness to population subgroups. This makes it hard for designers to understand the algorithms and to design products or services in a way that respects users' goals, values, and needs. We propose a method that helps designers and users explore algorithms, visualize their trade-offs, and select algorithms whose trade-offs are consistent with their goals and needs. We evaluate our method on the problem of predicting criminal defendants' likelihood of re-offending through (i) a large-scale Amazon Mechanical Turk experiment and (ii) in-depth interviews with domain experts. Our evaluations show that the method helps designers and users of these systems better understand and navigate algorithmic trade-offs. This paper contributes a new way of giving designers the ability to understand and control the outcomes of the algorithmic systems they create.
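To make the accuracy-versus-fairness trade-off concrete, the sketch below scores a set of candidate models on overall accuracy and on the gap in false-positive rates between two population subgroups, the kind of side-by-side comparison a designer could use when choosing among models. This is an illustrative Python sketch only, not the paper's actual method or tooling: the function names (subgroup_fairness_gap, tradeoff_table), the choice of fairness metric, and the assumption of fitted models exposing a scikit-learn-style predict method are all hypothetical.

```python
import numpy as np

def subgroup_fairness_gap(y_true, y_pred, group):
    """Absolute difference in false-positive rates between two subgroups.
    One common group-fairness metric; other metrics could be swapped in."""
    fprs = []
    for g in np.unique(group):          # assumes exactly two subgroup labels
        negatives = (group == g) & (y_true == 0)
        fprs.append(y_pred[negatives].mean() if negatives.any() else 0.0)
    return abs(fprs[0] - fprs[1])

def tradeoff_table(models, X, y_true, group):
    """Score each candidate model on accuracy and the fairness gap,
    so the trade-offs can be compared side by side."""
    rows = []
    for name, model in models.items():  # hypothetical dict of fitted models
        y_pred = model.predict(X)
        accuracy = (y_pred == y_true).mean()
        gap = subgroup_fairness_gap(y_true, y_pred, group)
        rows.append((name, accuracy, gap))
    return rows
```

Plotting the resulting (accuracy, gap) pairs would show which candidate models sit on the trade-off frontier, which is one simple way to surface the trade-offs the abstract describes.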
