CrowdSearch: exploiting crowds for accurate real-time image search on mobile phones

Mobile phones are becoming increasingly sophisticated, with a rich set of on-board sensors and ubiquitous wireless connectivity. However, the ability to fully exploit the sensing capabilities of mobile phones is stymied by limitations in multimedia processing techniques. For example, search using cellphone images often suffers from high error rates because of low image quality. In this paper, we present CrowdSearch, an accurate image search system for mobile phones. CrowdSearch combines automated image search with real-time human validation of search results. Automated image search is performed using a combination of local processing on mobile phones and backend processing on remote servers. Human validation is performed using Amazon Mechanical Turk, where tens of thousands of people actively work on simple tasks for monetary rewards. Image search with human validation presents a complex set of tradeoffs involving energy, delay, accuracy, and monetary cost. CrowdSearch addresses these challenges using a novel predictive algorithm that determines which results need to be validated, and when and how to validate them. CrowdSearch is implemented on Apple iPhones and Linux servers. We show that CrowdSearch achieves over 95% precision across multiple image categories, provides responses within minutes, and costs only a few cents.
