Formal Framework of XML Document Schema Design

Designing "good" XML documents is a difficult task for a database designer. Although many theories for XML database design have been proposed, no commercial tool has been developed to assist the XML document designer. In this paper, the authors present a formal framework for XML document design that combines a conceptual model of XML schema, called Graph-Document Type Definition (G-DTD), with a theory of database normalization. The framework is intended as a blueprint to help XML database designers perform XML document schema design quickly and accurately. The G-DTD describes the structure of XML documents at the schema level. A set of normal forms for G-DTD, based on rules proposed by Arenas and Libkin and by Lv et al., provides a guideline for arriving at a well-designed schema for XML documents. The authors develop a prototype of XML document schema design using the Z formal specification language. Finally, the formal specification is validated against a case study to check its correctness and consistency. This gives confidence that the authors' prototype can be implemented successfully to generate XML document designs automatically.
