Construction of a Turkish proposition bank

This paper describes our approach to developing the Turkish PropBank, which adopts the semantic role labeling guidelines of the original PropBank and uses the translation of the English Penn Treebank as a resource. We discuss the semantic annotation process of the PropBank and language-specific cases for Turkish, the tools we have developed for annotation, and quality control for multiuser annotation. In the current phase of the project, more than 9,500 sentences have been semantically analyzed, and predicate-argument information has been extracted for 1,330 verbs and 1,914 verb senses. We plan to annotate 17,000 sentences by the end of 2017.
