Refining OEM to Improve Features of Query Languages for Semistructured Data

Semistructured data is often described as “schemaless” or “self-describing”, meaning that there is no separate description of the type or structure of the data. This contrasts with structured approaches such as relational databases, where the data structure is usually designed first and described as a database schema. Semistructured data is data whose structure is irregular, heterogeneous, and partial, has no fixed format, and evolves quickly. These characteristics are typical of data available on the Web (HTML pages, e-mail message bases, bookmark collections, etc.). Research on semistructured data in the late 1990s aimed to extend database management techniques to such data (Suciu, 1998).
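As an illustration of what “self-describing” can mean in practice, the following minimal Python sketch models a small bookmark collection in which each record carries its own labels and no schema is declared anywhere. The records, the URLs, and the helper names `labels`, `common_labels`, and `all_labels` are hypothetical and chosen only for this example; they are not taken from any particular system.

```python
# Hypothetical bookmark records: the labels travel with the data,
# and no two records are required to share the same structure.
bookmarks = [
    {"title": "OEM paper", "url": "http://example.org/oem"},
    {"title": "Mail archive", "url": "http://example.org/mail",
     "tags": ["email", "archive"]},
    {"title": "Untitled note", "folder": {"name": "misc", "created": 1998}},
]


def labels(record):
    """Return the set of top-level labels present in one record."""
    return set(record.keys())


def common_labels(records):
    """Labels that every record has (the regular part of the structure)."""
    return set.intersection(*(labels(r) for r in records))


def all_labels(records):
    """Labels that appear in at least one record."""
    return set.union(*(labels(r) for r in records))


if __name__ == "__main__":
    print("shared by all records:", sorted(common_labels(bookmarks)))
    print("seen in some record:  ", sorted(all_labels(bookmarks)))
```

In this sketch only one label is shared by all records, while the others appear irregularly, which is the kind of partial and heterogeneous structure the paragraph above describes.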