Paper Information - Human Decision Making with Machine Assistance

Human Decision Making with Machine Assistance

Much of the political debate focuses on the concern that machines might take over. Yet in many domains it is much more plausible that the ultimate choice and responsibility remain with a human decision-maker, who is provided with machine advice. A quintessential illustration is a judge's decision to bail or jail a defendant. In multiple jurisdictions in the US, judges have access to a machine prediction of a defendant's recidivism risk. In our study, we explore how receiving machine advice influences people's bail decisions. We run a vignette experiment with laypersons, whom we test on a subsample of cases from the database of this prediction tool. In study 1, we ask them to predict whether defendants will recidivate before being tried, and we manipulate whether they have access to machine advice. We find that receiving machine advice has a small effect, which is biased in the direction of predicting no recidivism. In the field, human decision-makers sometimes have a chance, after the fact, to learn whether the machine has given good advice. In study 2, we inform participants of the ground truth after each trial. This does not make them more likely to follow the advice, even though the machine is (on average) slightly more accurate than real judges. The same holds if the advice is initially mostly correct, or if it initially mostly predicts (no) recidivism. Real judges know that their decisions affect defendants' lives. They may also be concerned about reelection or promotion. Hence a lot is at stake. In study 3, we emulate high stakes by giving participants a financial incentive. An incentive to find the ground truth, or to avoid false positives or false negatives, does not make participants more sensitive to machine advice. But an incentive to follow the advice is effective.
