Complementing the Execution of AI Systems with Human Computation

For many tasks that come naturally to humans, the performance of AI systems remains below human-level performance. We show how human intellect, made available via crowdsourcing, can complement an existing AI system at execution time. We introduce a hybrid workflow that queries people to verify and correct the system's output, and we present a simulation-based workflow optimization method that balances the cost of human input against the expected improvement in performance. Through empirical evaluations on an image captioning system, we show that the hybrid system significantly outperforms the fully automated system by properly trading off the cost of human input against the expected benefit. Finally, we show that human input collected at execution time can be used to teach the system about its errors and limitations.
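
The query-or-not decision at the heart of such a workflow can be illustrated with a minimal sketch (a hypothetical illustration, not the paper's implementation): assuming the AI system exposes a per-output confidence score, Monte Carlo simulation estimates the expected quality gain from a human verify-and-correct step, and a person is queried only when that gain exceeds the cost of human input. The names and numbers here (`HUMAN_COST`, `simulate_quality_gain`, the confidence values) are illustrative assumptions.

```python
import random

# Hypothetical parameters (illustrative, not from the paper):
HUMAN_COST = 0.10      # cost of soliciting one human verification
N_SIMULATIONS = 1000   # Monte Carlo rollouts per decision

def simulate_quality_gain(confidence: float) -> float:
    """Simulate one verify-and-correct step.

    If the AI output is wrong (probability 1 - confidence), a human
    correction recovers full quality (gain 1.0); otherwise the human
    merely confirms the output and the gain is zero.
    """
    return 0.0 if random.random() < confidence else 1.0

def expected_gain(confidence: float, n: int = N_SIMULATIONS) -> float:
    """Estimate the expected quality improvement by simulation."""
    return sum(simulate_quality_gain(confidence) for _ in range(n)) / n

def should_query_human(confidence: float) -> bool:
    """Query a person only when expected improvement exceeds the cost."""
    return expected_gain(confidence) > HUMAN_COST

# Low-confidence caption: human input is likely worth its cost.
print(should_query_human(confidence=0.55))  # likely True
# High-confidence caption: paying for verification rarely pays off.
print(should_query_human(confidence=0.97))  # likely False
```

In this toy model the trade-off reduces to querying whenever the system's error probability exceeds the (normalized) cost of a human label; richer simulations could account for imperfect workers or partial corrections.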
