Evidence Humans Provide When Explaining Data-Labeling Decisions

Because machine learning would benefit from reduced data requirements, some prior work has proposed that humans not only label data but also explain those labels. To characterize the evidence humans might want to provide, we conducted a user study and a data experiment. In the user study, 75 participants assigned classification labels to 20 photos and justified each label with a free-text explanation. Explanations frequently referenced concepts (objects and attributes) in the image, yet 26% of explanations invoked concepts not present in the image. Boolean logic appeared often in implicit form but was rarely made explicit. In a follow-up experiment on the Visual Genome dataset, we found that some concepts could be partially defined through their relationships to frequently co-occurring concepts, rather than only through labeling.
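
As a rough illustration of the co-occurrence idea (a sketch, not the paper's actual analysis), one could estimate, from per-image concept annotations like Visual Genome's, the conditional probability that concept b appears given concept a, and treat high-probability neighbors as a partial definition of a. The simplified annotation format, the min_support filter, and the 0.5 threshold below are all assumptions for the sake of the example.

    from collections import Counter, defaultdict
    from itertools import combinations

    def co_occurrence_profile(image_concepts, min_support=50, threshold=0.5):
        """Estimate P(b | a) for concept pairs from per-image concept sets.

        image_concepts: iterable of sets, one set of concept names per image
                        (an assumed simplification of Visual Genome's
                        object/attribute annotations).
        Returns: dict mapping concept a -> list of (b, P(b | a)) for
                 concepts b that co-occur with a in at least `threshold`
                 of a's images.
        """
        concept_count = Counter()          # images containing each concept
        pair_count = defaultdict(Counter)  # pair_count[a][b]: images with both

        for concepts in image_concepts:
            concept_count.update(concepts)
            for a, b in combinations(sorted(concepts), 2):
                pair_count[a][b] += 1
                pair_count[b][a] += 1

        profile = {}
        for a, n_a in concept_count.items():
            if n_a < min_support:          # skip rare concepts
                continue
            neighbors = [(b, n_ab / n_a)
                         for b, n_ab in pair_count[a].items()
                         if n_ab / n_a >= threshold]
            profile[a] = sorted(neighbors, key=lambda x: -x[1])
        return profile

    # Toy usage: "zebra" co-occurs with "stripes" in every image, so
    # "stripes" becomes part of zebra's co-occurrence profile.
    images = [{"zebra", "stripes", "grass"},
              {"zebra", "stripes", "sky"},
              {"dog", "grass"}]
    print(co_occurrence_profile(images, min_support=2, threshold=0.9))

A concept whose profile contains a strong neighbor (e.g., zebra -> stripes) can then be partially characterized by that neighbor without any additional labeled examples, which is the kind of complementary signal the data experiment points to.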
