Annotations with EARMARK for arbitrary, overlapping and out-of-order markup

In this paper we propose a novel approach to markup, called Extreme Annotational RDF Markup (EARMARK), which uses RDF and OWL to annotate features of text content that conventional markup languages cannot express. EARMARK provides a unifying framework that handles tree-based XML features as well as more complex markup for non-XML scenarios, such as overlapping elements, repeated and non-contiguous ranges, and structured attributes. It includes and extends the principles of XML markup, RDFa inline annotations, and existing approaches to overlapping markup such as LMNL and TexMECS. EARMARK documents can also be linearized into plain XML: one of several strategies is chosen to express a tree-based subset of the annotations as an XML structure, and the remaining annotations are fitted in through a number of "tricks", that is, markup expedients for the hierarchical linearization of non-hierarchical features. By exploiting recent Semantic Web concepts and languages, EARMARK offers a solid platform for vocabulary-independent, declarative support of advanced document features such as transclusion, overlapping and out-of-order annotations within a conceptually insensitive environment such as XML.
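
The idea of annotating ranges of a shared text, rather than nesting elements inside it, can be illustrated with a minimal sketch. It is written in plain Python rather than RDF/OWL, and the names Range, MarkupItem, content and overlaps are hypothetical choices made for this illustration; they are not taken from the EARMARK ontology. The sketch only assumes that a markup item refers to an ordered list of character ranges over one text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Range:
    """A half-open character span [begin, end) over a shared text (illustrative)."""
    begin: int
    end: int

    def overlaps(self, other: "Range") -> bool:
        # Proper overlap: the spans intersect, but neither contains the other.
        intersect = self.begin < other.end and other.begin < self.end
        nested = (
            (self.begin <= other.begin and other.end <= self.end)
            or (other.begin <= self.begin and self.end <= other.end)
        )
        return intersect and not nested


@dataclass
class MarkupItem:
    """A named annotation pointing at an ordered list of ranges (illustrative)."""
    name: str
    ranges: List[Range]

    def content(self, text: str) -> str:
        # Ranges are read in the order listed, which may differ from text order.
        return "".join(text[r.begin:r.end] for r in self.ranges)


text = "The quick brown fox"

# Two annotations whose spans overlap; a single XML tree cannot nest them.
phrase_a = MarkupItem("a", [Range(4, 15)])    # "quick brown"
phrase_b = MarkupItem("b", [Range(10, 19)])   # "brown fox"

# One annotation over non-contiguous ranges.
subject = MarkupItem("subject", [Range(0, 4), Range(16, 19)])

# One annotation that lists its ranges out of text order and
# reuses a range that another item already points at.
reordered = MarkupItem("reordered", [Range(16, 19), Range(3, 4), Range(4, 9)])

print(phrase_a.content(text))
print(phrase_b.content(text))
print(phrase_a.ranges[0].overlaps(phrase_b.ranges[0]))
print(subject.content(text))
print(reordered.content(text))
```

Because the annotations live outside the text, overlap, repetition and reordering need no special syntax here. Choosing which subset of items to express as a tree is what a linearization into XML would have to decide.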
