Towards Building a High-Quality Workforce with Mechanical Turk

Online crowdsourcing services provide an inexpensive and scalable platform for large-scale information verification tasks. We present our experiences using Amazon's Mechanical Turk (AMT) to verify over 100,000 local business listings for an online directory. We compare the performance of AMT workers to that of experts across five different types of tasks and find that most workers do not contribute high-quality work. We present the results of preliminary experiments towards filtering low-quality workers and increasing overall workforce accuracy. Finally, we directly compare workers' accuracy on business categorization tasks against a Naïve Bayes classifier trained on user-contributed business reviews and find that the classifier outperforms our workforce. Our report aims to inform the community of empirical results and cost constraints that are critical to understanding the problem of quality control in crowdsourcing systems.
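The Naïve Bayes baseline mentioned above can be illustrated with a minimal sketch. The snippet below is a generic multinomial Naïve Bayes text classifier with Laplace smoothing, written from scratch on hypothetical toy data; the actual features, preprocessing, and training corpus used in the study are not specified here and everything in this example (function names, the toy reviews) is an assumption for illustration only.

```python
# Minimal multinomial Naive Bayes classifier for business categorization.
# Hypothetical sketch: toy data and names are illustrative, not the paper's setup.
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """docs: list of (text, label). Returns (label_counts, word_counts, vocab)."""
    label_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)  # per-label word frequencies
    vocab = set()
    for text, label in docs:
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(text, label_counts, word_counts, vocab):
    """Pick the label maximizing log P(label) + sum log P(word|label)."""
    total_docs = sum(label_counts.values())
    best_label, best_logprob = None, float("-inf")
    for label, count in label_counts.items():
        logprob = math.log(count / total_docs)  # class prior
        denom = sum(word_counts[label].values()) + len(vocab)  # Laplace smoothing
        for word in text.lower().split():
            logprob += math.log((word_counts[label][word] + 1) / denom)
        if logprob > best_logprob:
            best_label, best_logprob = label, logprob
    return best_label


# Hypothetical user-contributed reviews labeled with business categories.
reviews = [
    ("great pizza and pasta", "restaurant"),
    ("delicious burgers and fries", "restaurant"),
    ("nice haircut and styling", "salon"),
    ("great manicure and styling", "salon"),
]
model = train_nb(reviews)
print(classify("pizza and burgers", *model))  # → restaurant
```

In practice a production classifier would add tokenization, stop-word removal, and a held-out evaluation set, but the core computation above matches the standard multinomial Naïve Bayes formulation.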