Crowdsourcing Multi-label Audio Annotation Tasks with Citizen Scientists

Annotating rich audio data is an essential aspect of training and evaluating machine listening systems. We approach this task in the context of temporally complex urban soundscapes, which require multiple labels to identify overlapping sound sources. This work is typically crowdsourced, and previous studies have shown that workers can quickly annotate audio with binary labels for single classes. However, this approach is difficult to scale when multiple passes, each with a different focus class, are required to produce multi-label annotations. In citizen science, where tasks are often image-based, annotation efforts typically label multiple classes simultaneously in a single pass. This paper describes our data collection on the Zooniverse citizen science platform, comparing the efficiency of different audio annotation strategies. We compared multi-pass binary annotation, single-pass multi-label annotation, and a hybrid approach: hierarchical multi-pass multi-label annotation. We discuss our findings, which support the use of multi-label annotation, with reference to the motivations of volunteer citizen scientists.
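
The three strategies differ mainly in how many times each clip must be shown to a volunteer before it carries a complete multi-label annotation. The sketch below illustrates that trade-off under an assumed, hypothetical coarse/fine class taxonomy; the COARSE_TO_FINE mapping and the passes_per_clip function are illustrative names, not the taxonomy or tooling used in the paper.

```python
# A minimal sketch of the pass-count trade-off between annotation strategies.
# The coarse/fine taxonomy below is hypothetical and illustrative only.
from typing import Dict, List

COARSE_TO_FINE: Dict[str, List[str]] = {
    "engine": ["small engine", "medium engine", "large engine"],
    "alert signal": ["car horn", "siren", "reverse beeper"],
    "human voice": ["talking", "shouting"],
}

def passes_per_clip(strategy: str) -> int:
    """Worst-case number of times a clip is presented to a volunteer
    before it carries a complete multi-label annotation."""
    fine_classes = [f for fines in COARSE_TO_FINE.values() for f in fines]
    if strategy == "multi-pass binary":
        # One yes/no pass per fine class.
        return len(fine_classes)
    if strategy == "single-pass multi-label":
        # All classes are offered at once in a single interface.
        return 1
    if strategy == "hierarchical multi-pass multi-label":
        # One coarse multi-label pass, then one fine multi-label pass per
        # coarse category marked present (worst case: all are present).
        return 1 + len(COARSE_TO_FINE)
    raise ValueError(f"unknown strategy: {strategy}")

if __name__ == "__main__":
    for s in ("multi-pass binary",
              "single-pass multi-label",
              "hierarchical multi-pass multi-label"):
        print(f"{s}: {passes_per_clip(s)} passes per clip (worst case)")
```

Under these assumptions, multi-pass binary annotation scales with the number of fine classes, single-pass multi-label annotation requires only one pass regardless of vocabulary size, and the hierarchical approach sits in between, with a worst case that grows with the number of coarse categories rather than fine classes.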
