Accounting for Confirmation Bias in Crowdsourced Label Aggregation

Collecting large-scale human-annotated datasets via crowdsourcing to train and improve automated models is a prominent human-in-the-loop approach for integrating human and machine intelligence. However, along with their unique intelligence, humans also bring their biases and subjective beliefs, which may degrade the quality of the annotated data and negatively impact the effectiveness of human-in-the-loop systems. One of the most common cognitive biases that humans are subject to is confirmation bias, the tendency to favor information that confirms one's existing beliefs and values. In this paper, we present an algorithmic approach to infer the correct answers to tasks by aggregating the annotations from multiple crowd workers, while taking workers' varying levels of confirmation bias into account. Evaluations on real-world crowd annotations show that the proposed bias-aware label aggregation algorithm outperforms baseline methods in accurately inferring the ground-truth labels of different tasks when crowd workers indeed exhibit some degree of confirmation bias. Through simulations on synthetic data, we further identify the conditions under which the proposed algorithm has the largest advantages over baseline methods.
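
The abstract does not spell out the aggregation model, so the following is only a minimal sketch of how a bias-aware label aggregator could look: a one-coin, Dawid-Skene-style EM in which each worker either copies a prior belief (with per-worker probability b_i, the confirmation-bias strength) or answers according to the true label with accuracy p_i. The function name bias_aware_em, the binary-label setting, and the assumption that each worker's per-task prior belief is observable are illustrative choices, not details taken from the paper.

```python
import numpy as np

def bias_aware_em(answers, beliefs, n_iter=50, eps=1e-9):
    """EM-style aggregation of binary crowd labels with a per-worker
    confirmation-bias component (illustrative sketch, not the paper's model).

    answers : (n_workers, n_tasks) float array with values in {0, 1};
              np.nan marks tasks a worker did not annotate.
    beliefs : (n_workers, n_tasks) array in {0, 1}; the label each worker is
              assumed to believe before seeing the task (this sketch assumes
              such prior beliefs are available, e.g. from a pre-task survey).

    Returns (posterior P(z_j = 1), per-worker accuracy p, per-worker bias b).
    """
    observed = ~np.isnan(answers)
    A = np.where(observed, answers, 0.0)          # dummy-fill missing answers
    match_belief = (A == beliefs) & observed      # answer confirms prior belief

    n_workers, _ = A.shape
    p = np.full(n_workers, 0.75)                  # accuracy when not biased
    b = np.full(n_workers, 0.20)                  # prob. of copying the belief
    pi = 0.5                                      # class prior P(z = 1)

    for _ in range(n_iter):
        # Likelihood of each observed answer under each candidate true label z.
        lik = {}
        for z in (0, 1):
            correct = (A == z) & observed
            lik[z] = np.clip(
                b[:, None] * match_belief
                + (1 - b[:, None]) * np.where(correct, p[:, None], 1 - p[:, None]),
                eps, 1.0)

        # E-step: posterior over true labels ...
        log1 = np.log(pi) + np.where(observed, np.log(lik[1]), 0.0).sum(axis=0)
        log0 = np.log(1 - pi) + np.where(observed, np.log(lik[0]), 0.0).sum(axis=0)
        post = 1.0 / (1.0 + np.exp(np.clip(log0 - log1, -500, 500)))

        # ... and responsibility that an answer came from the bias component.
        r = {z: b[:, None] * match_belief / lik[z] for z in (0, 1)}
        r_mix = post * r[1] + (1 - post) * r[0]

        # M-step: re-estimate bias strength, accuracy, and the class prior.
        b = (r_mix * observed).sum(axis=1) / (observed.sum(axis=1) + eps)
        unbiased = post * (1 - r[1]) + (1 - post) * (1 - r[0])
        hit = post * (1 - r[1]) * (A == 1) + (1 - post) * (1 - r[0]) * (A == 0)
        p = ((hit * observed).sum(axis=1) + eps) / ((unbiased * observed).sum(axis=1) + 2 * eps)
        pi = np.clip(post.mean(), 1e-3, 1 - 1e-3)

    return post, p, b
```

Setting every b_i to zero recovers a standard one-coin Dawid-Skene-style EM, so the bias component is the only departure from conventional aggregation in this sketch; whether the paper models confirmation bias in this particular way cannot be determined from the abstract alone.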
