Specifying OLAP Cubes on XML Data

On-Line Analytical Processing (OLAP) enables analysts to gain insight into data through fast, interactive access to a variety of views of information organized in a dimensional model. The demand for data integration is growing rapidly as more and more information sources appear in modern enterprises. In the data warehousing approach, selected information is extracted in advance and stored in a repository; this approach is used because of its high performance. However, in many situations a logical (rather than physical) integration of data is preferable. Previous Web-based data integration efforts have focused almost exclusively on the logical level of data models, creating a need for techniques focused on the conceptual level. Moreover, previous integration techniques for Web-based data have not addressed the special needs of OLAP tools, such as handling dimensions with hierarchies. Extensible Markup Language (XML) is fast becoming the new standard for data representation and exchange on the World Wide Web. The rapid emergence of XML data on the Web, e.g., in business-to-business (B2B) e-commerce, is making it necessary for OLAP and other data analysis tools to handle XML data as well as traditional data formats. Based on a real-world case study, this paper presents an approach to the conceptual specification of OLAP databases based on Web data. Unlike previous work, this approach takes special OLAP issues such as dimension hierarchies and correct aggregation of data into account. Additionally, an integration architecture that allows the logical integration of XML and relational data sources for use by OLAP tools is presented.
