Modelling Linguistic Annotations

This chapter describes how linguistic annotations can be represented in RDF. Web Annotation and NIF provide the means to reference text segments on the web, but representing the linguistic annotations themselves requires appropriate vocabularies. We discuss relevant vocabularies and illustrate how they can be applied to support annotation at different levels.
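As a rough illustration of the division of labour described above, the following minimal sketch first addresses a text segment with NIF-style offsets and then attaches a linguistic annotation to it. It is not taken from the chapter: the document URI, the helper names `char_uri` and `token_triples`, and the choice of `nif:posTag` to carry a part-of-speech label are assumptions made for illustration, and the property names should be checked against the NIF core ontology before reuse.

```python
# Minimal sketch: emit N-Triples for one token, addressed NIF-style.
# Assumptions (not from the chapter): the example document URI, the helper
# names, and the use of nif:posTag for the part-of-speech label.

NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INT = "http://www.w3.org/2001/XMLSchema#nonNegativeInteger"


def char_uri(doc_uri, begin, end):
    """Mint an offset-based fragment URI (#char=begin,end) for a text span."""
    return f"{doc_uri}#char={begin},{end}"


def literal(value, datatype=None):
    """Serialise a value as an N-Triples literal with basic escaping."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"^^<{datatype}>' if datatype else f'"{escaped}"'


def token_triples(doc_uri, text, begin, end, pos_tag):
    """Return N-Triples lines for the token text[begin:end] and its POS tag."""
    context = char_uri(doc_uri, 0, len(text))
    token = char_uri(doc_uri, begin, end)
    statements = [
        # The whole text is the reference context of the token.
        (context, RDF_TYPE, f"<{NIF}Context>"),
        (context, NIF + "isString", literal(text)),
        # The token is a substring identified by its character offsets.
        (token, RDF_TYPE, f"<{NIF}Word>"),
        (token, NIF + "referenceContext", f"<{context}>"),
        (token, NIF + "beginIndex", literal(begin, XSD_INT)),
        (token, NIF + "endIndex", literal(end, XSD_INT)),
        (token, NIF + "anchorOf", literal(text[begin:end])),
        # The annotation itself: a part-of-speech label (assumed property).
        (token, NIF + "posTag", literal(pos_tag)),
    ]
    return [f"<{s}> <{p}> {o} ." for s, p, o in statements]


if __name__ == "__main__":
    for line in token_triples("http://example.org/doc.txt", "Peter sleeps", 0, 5, "NNP"):
        print(line)
```

The offsets only identify the segment; the meaning of the label "NNP" is left to whichever annotation vocabulary the tag is drawn from, which is the gap the vocabularies discussed in the chapter are meant to fill.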
