Encoding and Querying Multi-Structured Documents

This paper addresses the multi-structuring of documents. Several distinct structures may be defined simultaneously over the same original document, each serving a different use. For example, a document may have one structure for the logical organisation of its content (logical structure) and another that expresses a set of content formatting rules (physical structure). We have previously proposed a generic model for multi-structured documents, called MSDM, which establishes several important features. Here we address the problem of encoding such documents. We present a new formalism, called MultiX, for encoding multi-structured documents efficiently. The formalism is based on the MSDM model and uses XML syntax.
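To make the encoding problem concrete, the following minimal sketch shows a logical and a physical structure defined over the same text, and why the two cannot always be nested in a single tree. It is an illustration only: it does not reproduce MultiX syntax or the MSDM model, and the names `Span`, `text_of` and `crosses`, as well as the sample text and offsets, are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A labelled region of the shared text (hypothetical helper type)."""
    label: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


def text_of(content: str, span: Span) -> str:
    """Return the text covered by a span."""
    return content[span.start:span.end]


def crosses(a: Span, b: Span) -> bool:
    """True if a and b overlap without either containing the other."""
    intersect = a.start < b.end and b.start < a.end
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return intersect and not (a_in_b or b_in_a)


# One original content, shared by both structures.
content = "One text. Two structures."

# Logical structure: the content split into sentences.
logical = [Span("sentence", 0, 9), Span("sentence", 10, 25)]

# Physical structure: the same content split into display lines.
physical = [Span("line", 0, 13), Span("line", 14, 25)]

# Pairs that overlap without nesting: these are the cases that a single
# well-formed XML hierarchy cannot express directly.
crossing = [(s, l) for s in logical for l in physical if crosses(s, l)]

for s, l in crossing:
    print(f"{s.label} {text_of(content, s)!r} crosses "
          f"{l.label} {text_of(content, l)!r}")
```

In this sample the second sentence starts inside the first line and ends inside the second, so the two structures cannot be merged into one properly nested hierarchy without splitting elements or duplicating content.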
