Scalable Spatial Crowdsourcing: A Study of Distributed Algorithms

Recently, spatial crowdsourcing was introduced as a natural extension of traditional crowdsourcing in which tasks have a geospatial component, i.e., a task can only be performed if a worker is physically present at the location of the task. The problem of assigning spatial tasks to workers in a spatial crowdsourcing system can be formulated as a weighted bipartite b-matching problem, which can be solved optimally by existing methods for the minimum cost maximum flow problem. However, these methods are still too costly to run repeatedly in an online system, especially as the number of incoming workers and tasks increases. Hence, we propose a class of approaches that use an online partitioning method to divide the problem space across a set of cloud servers, constructing independent bipartite graphs so that the assignment problem can be solved in parallel. Our approaches solve the spatial task assignment problem approximately, yet remain competitive with the exact solution. We experimentally verify that our approximate approaches outperform the centralized and MapReduce versions of the exact approach with acceptable accuracy and are thus suitable for online spatial crowdsourcing at scale.
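As a rough illustration of the formulation above, the following sketch models assignment as a minimum cost maximum flow problem and then solves it per partition. It is a minimal example under stated assumptions, not the method evaluated in the paper: the worker fields (capacity, reach radius), the Euclidean distance cost, the uniform grid used for partitioning, and all function names (`min_cost_max_flow`, `assign`, `partitioned_assign`) are hypothetical choices made for this example.

```python
import math
from collections import deque

def min_cost_max_flow(n, edges, s, t):
    """Successive shortest paths using a queue-based Bellman-Ford search.
    edges: list of (u, v, capacity, cost). Returns (flow, cost, residual graph)."""
    graph = [[] for _ in range(n)]  # entries: [to, residual cap, cost, reverse index]
    for u, v, cap, cost in edges:
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])
    flow, total = 0, 0.0
    while True:
        dist = [math.inf] * n
        prev = [None] * n
        queued = [False] * n
        dist[s] = 0.0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            queued[u] = False
            for i, (v, cap, cost, _) in enumerate(graph[u]):
                if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                    dist[v] = dist[u] + cost
                    prev[v] = (u, i)
                    if not queued[v]:
                        queued[v] = True
                        queue.append(v)
        if prev[t] is None:  # no augmenting path left
            return flow, total, graph
        push, v = math.inf, t
        while v != s:  # bottleneck capacity along the path
            u, i = prev[v]
            push = min(push, graph[u][i][1])
            v = u
        v = t
        while v != s:  # update residual capacities
            u, i = prev[v]
            graph[u][i][1] -= push
            graph[v][graph[u][i][3]][1] += push
            v = u
        flow += push
        total += push * dist[t]

def assign(workers, tasks):
    """workers: (x, y, capacity, radius) tuples; tasks: (x, y) tuples (assumed model).
    Returns (list of (worker index, task index), total distance)."""
    W, T = len(workers), len(tasks)
    s, t = W + T, W + T + 1
    edges = []
    for i, (wx, wy, capacity, radius) in enumerate(workers):
        edges.append((s, i, capacity, 0.0))  # a worker takes at most `capacity` tasks
        for j, (tx, ty) in enumerate(tasks):
            d = math.hypot(wx - tx, wy - ty)
            if d <= radius:  # only tasks the worker can reach
                edges.append((i, W + j, 1, d))
    for j in range(T):
        edges.append((W + j, t, 1, 0.0))  # each task is performed at most once
    _, cost, graph = min_cost_max_flow(W + T + 2, edges, s, t)
    pairs = [(i, v - W) for i in range(W)
             for v, cap, _, _ in graph[i] if W <= v < W + T and cap == 0]
    return pairs, cost

def partitioned_assign(workers, tasks, cell):
    """Hypothetical uniform-grid partitioning: solve each cell independently."""
    buckets = {}
    for i, w in enumerate(workers):
        buckets.setdefault((int(w[0] // cell), int(w[1] // cell)), ([], []))[0].append(i)
    for j, task in enumerate(tasks):
        buckets.setdefault((int(task[0] // cell), int(task[1] // cell)), ([], []))[1].append(j)
    pairs, cost = [], 0.0
    for w_ids, t_ids in buckets.values():  # each cell could run on its own server
        local_pairs, local_cost = assign([workers[i] for i in w_ids],
                                         [tasks[j] for j in t_ids])
        pairs += [(w_ids[a], t_ids[b]) for a, b in local_pairs]
        cost += local_cost
    return pairs, cost

workers = [(0, 0, 1, 5), (4, 0, 2, 5)]
tasks = [(1, 0), (3, 0), (5, 0)]
exact_pairs, exact_cost = assign(workers, tasks)
approx_pairs, approx_cost = partitioned_assign(workers, tasks, cell=4)
```

On this toy input the global formulation assigns all three tasks, while the grid version assigns only two because the second task and the worker with spare capacity fall into different cells. This is the kind of accuracy loss an approximate, partitioned solver trades for independent subproblems that can run in parallel.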
