Multilingual Supervision of Semantic Annotation

In this paper, we investigate the annotation projection of semantic units in a practical setting. Previous approaches have focused on using parallel corpora for semantic transfer. We evaluate an alternative approach that uses loosely parallel corpora and does not require the corpora to be exact translations of each other. We developed a method that transfers semantic annotations from one language to another using sentences aligned by entities, and we extended it to include alignments by entity-like linguistic units. We conducted our experiments on a large scale using the English, Swedish, and French language editions of Wikipedia. Our results show that annotation projection using entities, in combination with loosely parallel corpora, provides a viable approach to extending previous attempts. In addition, it allows the generation of proposition banks upon which semantic parsers can be trained.
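The idea of projecting annotations across sentences aligned by entities can be illustrated with a minimal sketch. This is not the authors' implementation: the `Sentence` class, the `align_by_entities` and `project_roles` functions, the `min_shared` threshold, and the entity identifiers `E1` and `E2` are all hypothetical names chosen for illustration. The sketch assumes each sentence has already been linked to language-independent entity identifiers and that the source sentence carries role labels on its entity tokens.

```python
from dataclasses import dataclass, field


@dataclass
class Sentence:
    tokens: list
    # token index -> language-independent entity id (hypothetical ids)
    entities: dict
    # token index -> semantic role label; empty for unannotated sentences
    roles: dict = field(default_factory=dict)


def align_by_entities(source, target, min_shared=2):
    """Pair source and target sentences that share at least
    `min_shared` entity ids (an assumed, simplistic criterion)."""
    pairs = []
    for s in source:
        s_ids = set(s.entities.values())
        for t in target:
            shared = s_ids & set(t.entities.values())
            if len(shared) >= min_shared:
                pairs.append((s, t))
    return pairs


def project_roles(src, tgt):
    """Copy each role label from a source entity token to the
    target token that mentions the same entity."""
    role_of = {
        src.entities[i]: role
        for i, role in src.roles.items()
        if i in src.entities
    }
    tgt.roles = {
        i: role_of[ent] for i, ent in tgt.entities.items() if ent in role_of
    }
    return tgt


# Usage: an annotated English sentence and an unannotated Swedish one.
english = Sentence(
    tokens=["Curie", "discovered", "polonium"],
    entities={0: "E1", 2: "E2"},
    roles={0: "A0", 2: "A1"},
)
swedish = Sentence(
    tokens=["Curie", "upptäckte", "polonium"],
    entities={0: "E1", 2: "E2"},
)

for src, tgt in align_by_entities([english], [swedish]):
    project_roles(src, tgt)
```

The sketch leaves out predicate transfer, entity-like linguistic units, and any filtering of noisy alignments, all of which a practical system would need to address.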