Validating and Extending Semantic Knowledge Bases using Video Games with a Purpose

Large-scale knowledge bases are important assets in NLP. Such resources are frequently constructed by automatically merging complementary resources, such as WordNet and Wikipedia. However, manually validating these resources is prohibitively expensive, even with methods such as crowdsourcing. We propose a cost-effective method of validating and extending knowledge bases using video games with a purpose. Two video games were created to validate concept-concept and concept-image relations. In experiments comparing against crowdsourcing, we show that video game-based validation consistently leads to higher-quality annotations, even when players are not
