Materializing Web Data for OLAP and DSS

Business decisions must rely not only on company-internal data but also on external data from competitors or relevant events. This information can be obtained from the WWW but must be integrated with the data in a company's data warehouse. In this paper we discuss a system architecture for warehousing Web content for OLAP and DSS. A self-describing object model is used to make the implicit modeling and context assumptions explicit, both for the data obtained from the Web and for the data already in the data warehouse. A domain-specific ontology provides a common basis for interpreting data and metadata. We propose an object-relational mapping that accounts for the peculiarities of relational data warehouses based on a star schema, together with a mapping rule language for describing the necessary transformation rules. The system framework described in this paper has been implemented in Java.
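
The idea of a self-describing value that is converted to the warehouse context and then mapped onto a star-schema fact row can be illustrated with a minimal Java sketch. It is not the system's actual object model or mapping rule language. All names here (SemanticValue, convert, toFactRow, the context keys "currency" and "scaleFactor", the column names, and the conversion rate) are assumptions made for illustration only.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class SelfDescribingSketch {

    /** A value that carries its context explicitly instead of assuming it (hypothetical type). */
    static final class SemanticValue {
        final String concept;               // name of an ontology concept, e.g. "price"
        final double value;
        final Map<String, String> context;  // explicit context, e.g. currency and scale factor

        SemanticValue(String concept, double value, Map<String, String> context) {
            this.concept = concept;
            this.value = value;
            this.context = Collections.unmodifiableMap(new LinkedHashMap<>(context));
        }
    }

    /** Converts a value into the warehouse context: scale factor 1 and the target currency. */
    static SemanticValue convert(SemanticValue v, String targetCurrency,
                                 Map<String, Double> ratesToTarget) {
        String currency = v.context.get("currency");
        if (currency == null) {
            throw new IllegalArgumentException("Value has no currency in its context");
        }
        double scale = Double.parseDouble(v.context.getOrDefault("scaleFactor", "1"));
        double rate = 1.0;
        if (!currency.equals(targetCurrency)) {
            Double r = ratesToTarget.get(currency);
            if (r == null) {
                throw new IllegalArgumentException("No conversion rate for " + currency);
            }
            rate = r;
        }
        Map<String, String> targetContext = new LinkedHashMap<>();
        targetContext.put("currency", targetCurrency);
        targetContext.put("scaleFactor", "1");
        return new SemanticValue(v.concept, v.value * scale * rate, targetContext);
    }

    /** Maps a converted value onto a star-schema fact row (column names are illustrative). */
    static Map<String, Object> toFactRow(SemanticValue v, int productKey, int dateKey) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("product_key", productKey);  // foreign key into a product dimension
        row.put("date_key", dateKey);        // foreign key into a date dimension
        row.put(v.concept, v.value);         // the measure column
        return row;
    }

    public static void main(String[] args) {
        Map<String, String> webContext = new HashMap<>();
        webContext.put("currency", "USD");
        webContext.put("scaleFactor", "1000");
        SemanticValue fromWeb = new SemanticValue("price", 1.5, webContext);

        Map<String, Double> rates = new HashMap<>();
        rates.put("USD", 0.5);  // made-up rate, chosen only to keep the arithmetic simple

        SemanticValue inWarehouseContext = convert(fromWeb, "EUR", rates);
        System.out.println(toFactRow(inWarehouseContext, 42, 19990101));
    }
}
```

Keeping the context as explicit metadata on each value means that the conversion step can be driven by data rather than hard-coded per source, which is the role the abstract assigns to the self-describing model and the shared ontology.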
