CrowdMap: Crowdsourcing Ontology Alignment with Microtasks

The last decade of research in ontology alignment has brought a variety of computational techniques to discover correspondences between ontologies. While the accuracy of automatic approaches has continuously improved, human contributions remain a key ingredient of the process: this input serves as a valuable source of domain knowledge that is used to train the algorithms and to validate and augment automatically computed alignments. In this paper, we introduce CrowdMap, a model to acquire such human contributions via microtask crowdsourcing. For a given pair of ontologies, CrowdMap translates the alignment problem into microtasks that address individual alignment questions, publishes the microtasks on an online labor market, and evaluates the quality of the results obtained from the crowd. We evaluated the current implementation of CrowdMap in a series of experiments using ontologies and reference alignments from the Ontology Alignment Evaluation Initiative and the crowdsourcing platform CrowdFlower. The experiments demonstrated that the overall approach is feasible and that it can improve the accuracy of existing ontology alignment solutions in a fast, scalable, and cost-effective manner.
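
The three steps named in the abstract (translate the alignment problem into microtasks, collect crowd answers, evaluate the result) can be illustrated with a minimal Python sketch. This is not CrowdMap's implementation: the names `Microtask`, `generate_microtasks`, `aggregate`, and `precision_recall` are hypothetical, ontologies are reduced to plain lists of concept labels, publishing to a labor market is replaced by a dictionary of simulated worker answers, and majority voting is an assumed aggregation rule that the abstract does not specify.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Microtask:
    """One alignment question: do these two concepts denote the same thing?"""
    source: str
    target: str


def generate_microtasks(source_concepts, target_concepts):
    """Create one microtask per (source, target) concept pair.

    Assumption: an exhaustive pairing; a real system would likely prune pairs.
    """
    return [Microtask(s, t) for s, t in product(source_concepts, target_concepts)]


def aggregate(answers, min_agreement=0.5):
    """Accept a correspondence when the share of 'yes' votes exceeds min_agreement.

    answers maps each Microtask to a list of booleans (True means "same concept").
    """
    accepted = []
    for task, votes in answers.items():
        if not votes:
            continue  # no worker answered this task
        yes_share = Counter(votes)[True] / len(votes)
        if yes_share > min_agreement:
            accepted.append((task.source, task.target))
    return accepted


def precision_recall(found, reference):
    """Compare crowd-derived correspondences against a reference alignment."""
    found, reference = set(found), set(reference)
    true_positives = len(found & reference)
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    tasks = generate_microtasks(["Author", "Paper"], ["Writer", "Article"])
    # Simulated worker votes (made up for illustration), three per task.
    simulated_votes = [
        [True, True, False],    # Author vs Writer
        [False, False, False],  # Author vs Article
        [False, True, False],   # Paper vs Writer
        [True, True, True],     # Paper vs Article
    ]
    answers = dict(zip(tasks, simulated_votes))
    alignment = aggregate(answers)
    reference = [("Author", "Writer"), ("Paper", "Article")]
    print(alignment, precision_recall(alignment, reference))
```

In this toy setup the reference alignment plays the role that the Ontology Alignment Evaluation Initiative reference alignments play in the experiments: the crowd-derived correspondences are scored against it with precision and recall.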
