Paper information - Crowd translator: on building localized speech recognizers through micropayments

We present a method to expand the number of languages covered by simple speech recognizers. Enabling speech recognition in users' primary languages greatly extends the types of mobile-phone-based applications available to people in developing regions. We describe how we expand language corpora through user-supplied speech contributions, how we quickly evaluate each contribution, and how we pay contributors for their work.
