CrowdSight: Rapidly Prototyping Intelligent Visual Processing Apps

We describe a framework for rapidly prototyping applications that require intelligent visual processing but for which reliable algorithms do not yet exist or are too costly to engineer. The framework, CrowdSight, uses crowdsourcing to offload intelligent processing to humans, so that new applications can be built quickly and cheaply, giving system builders the opportunity to validate a concept before committing significant time or capital. Our service accepts requests from users via either email or simple mobile applications, and handles all communication with a backend human computation platform. We build redundant requests and data aggregation into the system, freeing the user from managing these requirements. We validate our framework by building several test applications and verifying that prototypes can be built more easily and quickly than would be the case without the framework.
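To illustrate what aggregating redundant requests could look like, the following is a minimal sketch, not the CrowdSight implementation. The abstract does not say which aggregation rule is used; simple majority voting over normalized text answers is assumed here, and the function name `aggregate_responses` and the `min_agreement` parameter are hypothetical.

```python
from collections import Counter


def aggregate_responses(responses, min_agreement=0.5):
    """Return the majority answer among redundant worker responses.

    Assumption: answers are short strings, compared after trimming
    whitespace and lowercasing. Returns None when no answer is given
    by more than `min_agreement` of the non-empty responses.
    """
    # Normalize answers and drop empty ones.
    normalized = [r.strip().lower() for r in responses if r and r.strip()]
    if not normalized:
        return None

    # Pick the most frequent answer and check how strongly workers agree.
    answer, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) > min_agreement:
        return answer
    return None


# Example: four workers transcribe the text on the same photographed sign.
result = aggregate_responses(["Stop", "stop ", "STOP", "slop"])
```

Under this assumed rule, a request whose answers do not reach the agreement threshold returns `None`, which a prototype could treat as a signal to issue further redundant requests.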