Crowdsourcing Transcription Beyond Mechanical Turk

While much work has studied crowdsourced transcription via Amazon's Mechanical Turk, we are not aware of any prior cross-platform analysis of crowdsourcing service providers for transcription. We present a qualitative and quantitative analysis of eight such providers: 1-888-Type-It-Up, 3Play Media, Transcription Hub, CastingWords, Rev, TranscribeMe, Quicktate, and SpeakerText. We also provide a comparative evaluation against three transcribers hired through oDesk. The spontaneous speech used in our experiments is drawn from the USC-SFI MALACH collection of oral history interviews. After informally evaluating pilot transcripts from all providers, our formal evaluation measures word error rate (WER) over 10-minute segments from six interviews, each transcribed by three of the service providers and the three oDesk transcribers. We report the WER obtained in each case and, more generally, assess the tradeoffs among quality, cost, risk, and effort across alternative crowd-based transcription options.
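For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a hypothesis transcript and a reference transcript, normalized by the number of reference words. The minimal Python sketch below illustrates that standard computation; it is not the evaluation code used in this study, and the function name and example strings are purely illustrative.

```python
# Minimal word error rate (WER) sketch: Levenshtein distance over
# word tokens, normalized by reference length. Illustrative only;
# the paper's own evaluation pipeline is not reproduced here.

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming table: d[i][j] is the edit distance between
    # the first i reference words and the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# Example: one substitution ("sat" -> "sit") and one deleted "the"
# over six reference words gives a WER of 2/6 ~= 0.33.
print(wer("the cat sat on the mat", "the cat sit on mat"))
```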
