An attractive game with the document: (im)possible?

The annotation experience we have acquired while participating in the Prague treebanking projects gives us strong evidence that annotation of linguistic data by experts is a very intensive and expensive process. It is no surprise, then, that we ask whether annotated data can be obtained in a less demanding way. We focus on alternative ways of annotation that generate data for natural language processing tasks that either have not been implemented yet or have been implemented with performance below the human level. More specifically, we are interested in approaches usually gathered under the terms ‘crowdsourcing’ and ‘human computation’, i.e. we concentrate on activities that motivate as many non-experts as possible to devote whatever they prefer (effort, time, enthusiasm, responsibility, etc.) to carrying out annotation. In this paper, we review the notion of crowdsourcing, paying particular attention to crowdsourcing projects that work with textual data. Because we are fond of games with a purpose, we have implemented on-line games with texts. We introduce in detail a game on coreference, PlayCoref, and two games with words and white spaces in the sentence, Shannon Game and Place the Space. The game rules are designed to be language independent, and by default the games are playable with both Czech and English texts. After a number of sessions played so far, we revise our initial expectations of, and enthusiasm for, designing an attractive annotation game with a document.
