XML threatens to expand beyond its document markup origins to become the basis for data interchange on the Internet. One highly anticipated application of XML is the interchange of electronic data (EDI). Unlike existing Web documents, electronic data is primarily intended for computer, not human, consumption. For example, businesses could publish data about their products and services, and potential customers could compare and process this information automatically; business partners could exchange internal operational data between their information systems on secure channels; and search robots could automatically integrate information from related sources that publish their data in XML format, such as stock quotes from financial sites or sports scores from news sites. New opportunities will arise for third parties to add value by integrating, transforming, cleaning, and aggregating XML data. Once XML becomes pervasive, it is not hard to imagine that many information sources will structure their external view as a repository of XML data, no matter what their internal storage mechanisms are. Data exchange between applications will then be in XML format.
What, then, is the role of a query language in this world? One could see it as a local adjunct to a browsing capability, providing a more expressive “find” command over one or more retrieved documents. Or it might serve as a souped-up version of XPointer, allowing richer forms of logical reference to portions of documents. Neither of these modes of use is very “databasey”. From the database viewpoint, the enticing role of an XML query language is as a tool for structural and content-based querying that allows an application to extract precisely the information it needs from one or several XML data sources. One salient question is why SQL or OQL should not simply be adapted to query XML. The answer is that XML data is fundamentally different from relational and object-oriented data, and therefore neither SQL nor OQL is appropriate for XML.
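To make "structural and content-based" querying concrete, here is a minimal sketch that borrows the stock-quote scenario mentioned above. It is not written in any XML query language; it uses Python's standard `xml.etree.ElementTree` module and its limited XPath subset. The sample document, the element names (`quotes`, `quote`, `symbol`, `price`, `note`), and the function `symbols_priced_above` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; element names are illustrative only.
DOC = """<quotes>
  <quote><symbol>AAA</symbol><price>12.50</price></quote>
  <quote><symbol>BBB</symbol><price>101.00</price><note>halted</note></quote>
  <quote><symbol>CCC</symbol></quote>
</quotes>"""


def symbols_priced_above(xml_text, threshold):
    """Return the symbols of all quotes whose price exceeds threshold."""
    root = ET.fromstring(xml_text)
    result = []
    # Structural part: select only <quote> elements that have a <price> child.
    for quote in root.findall("./quote[price]"):
        # Content part: filter on the value stored in that child.
        if float(quote.findtext("price")) > threshold:
            result.append(quote.findtext("symbol"))
    return result


print(symbols_priced_above(DOC, 50))
```

The path expression selects elements by their position and shape in the document, while the comparison filters on their values. The application receives only the matching symbols and never has to walk the rest of the document itself.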
The key distinction between data in XML and data in traditional models is that XML is not rigidly structured. In the relational and object-oriented models, every data instance has a schema, which is separate from and independent of the data. In XML, the schema exists with the data. Thus, XML data is self-describing and can naturally model irregularities that cannot be modeled by relational or object-oriented data. For example, data items may have missing elements or multiple occurrences of the same element; elements may have atomic values in some data items and structured values in others; and collections of elements can have heterogeneous structure. Even XML data that has an associated DTD is self-describing (the data can still be parsed, even if