OntoNotes: A Unified Relational Semantic Representation

The OntoNotes project is creating a large-scale corpus with accurate, integrated annotation of multiple levels of shallow semantic structure in text. Such rich, integrated annotation should support cross-level models that enable significantly better automatic semantic analysis. At the same time, it demands a robust, efficient, and scalable mechanism for storing and accessing these complex, interdependent annotations. We describe a relational database representation that captures both the inter- and intra-layer dependencies, and we provide details of an object-oriented API for efficient, multi-tiered access to this data.
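To make the idea of a relational representation with an object-oriented access layer concrete, the following is a minimal sketch in Python using the standard sqlite3 module. It is not the OntoNotes schema or API: every table, column, class, and function name here (token, span, argument, mention, Corpus, build_demo) is a hypothetical stand-in chosen for illustration. The sketch assumes that annotation layers such as propositions and coreference both point at shared syntactic spans, so that a cross-layer query reduces to a join.

```python
import sqlite3

# Hypothetical schema: spans form the shared layer that the proposition
# (argument) and coreference (mention) layers refer to via foreign keys.
SCHEMA = """
CREATE TABLE token (
    id INTEGER PRIMARY KEY,
    doc_id TEXT NOT NULL,
    position INTEGER NOT NULL,
    word TEXT NOT NULL
);
CREATE TABLE span (
    id INTEGER PRIMARY KEY,
    doc_id TEXT NOT NULL,
    start_pos INTEGER NOT NULL,
    end_pos INTEGER NOT NULL,
    syntactic_label TEXT NOT NULL
);
CREATE TABLE argument (
    id INTEGER PRIMARY KEY,
    predicate_span INTEGER NOT NULL REFERENCES span(id),
    argument_span INTEGER NOT NULL REFERENCES span(id),
    role TEXT NOT NULL
);
CREATE TABLE mention (
    id INTEGER PRIMARY KEY,
    span_id INTEGER NOT NULL REFERENCES span(id),
    chain_id INTEGER NOT NULL
);
"""


class Corpus:
    """Illustrative object-oriented wrapper over the hypothetical tables."""

    def __init__(self, conn):
        self.conn = conn

    def span_text(self, span_id):
        """Return the words covered by a span, joined by spaces."""
        row = self.conn.execute(
            "SELECT doc_id, start_pos, end_pos FROM span WHERE id = ?",
            (span_id,),
        ).fetchone()
        words = self.conn.execute(
            "SELECT word FROM token "
            "WHERE doc_id = ? AND position BETWEEN ? AND ? "
            "ORDER BY position",
            row,
        ).fetchall()
        return " ".join(word for (word,) in words)

    def coreferent_arguments(self, role):
        """Cross-layer query: arguments with the given role whose span is
        also a coreference mention. Returns (text, chain_id) pairs."""
        rows = self.conn.execute(
            "SELECT a.argument_span, m.chain_id "
            "FROM argument a JOIN mention m ON m.span_id = a.argument_span "
            "WHERE a.role = ? ORDER BY a.id",
            (role,),
        ).fetchall()
        return [(self.span_text(span_id), chain) for span_id, chain in rows]


def build_demo():
    """Build an in-memory database holding one toy annotated sentence."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce inter-layer links
    conn.executescript(SCHEMA)
    words = ["The", "committee", "approved", "the", "plan"]
    conn.executemany(
        "INSERT INTO token (doc_id, position, word) VALUES ('demo', ?, ?)",
        list(enumerate(words)),
    )
    conn.executemany(
        "INSERT INTO span (id, doc_id, start_pos, end_pos, syntactic_label) "
        "VALUES (?, 'demo', ?, ?, ?)",
        [(1, 0, 1, "NP"), (2, 2, 2, "VBD"), (3, 3, 4, "NP")],
    )
    conn.executemany(
        "INSERT INTO argument (predicate_span, argument_span, role) "
        "VALUES (?, ?, ?)",
        [(2, 1, "ARG0"), (2, 3, "ARG1")],
    )
    conn.execute("INSERT INTO mention (span_id, chain_id) VALUES (1, 7)")
    conn.commit()
    return Corpus(conn)
```

In this sketch, calling build_demo().coreferent_arguments("ARG0") joins the proposition and coreference layers through the shared span table. Routing every layer through one span table is only one possible design; it is used here because it lets foreign keys express the inter-layer dependencies directly.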
