ANCOR-AS: Enriching the ANCOR Corpus with Syntactic Annotations

This paper presents ANCOR-AS, an enriched version of the ANCOR corpus that adds syntactic annotations to the existing coreference annotations and speech transcriptions. The corpus is also released in a new TEI-compliant XML format.
