The Annotation Scheme of the Turkish Discourse Bank and an Evaluation of Inconsistent Annotations

In this paper, we report on the annotation procedures we developed for the Turkish Discourse Bank (TDB), an effort that extends the Penn Discourse TreeBank (PDTB) annotation style to Turkish discourse. After a brief introduction to the TDB, we describe the annotation cycle and the annotation scheme we developed, identifying which parts of the scheme extend the PDTB and which parts differ from it. We provide inter-coder reliability calculations on the first and second arguments of some connectives and discuss the most important sources of disagreement among annotators.
