Integrating XML Data in the TARGIT OLAP System

We present work on logical integration of OLAP and XML data sources, carried out in cooperation between TARGIT, a Danish OLAP client vendor, and Aalborg University. A prototype has been developed that allows XML data on the WWW to be used as dimensions and measures in the OLAP system in the same way as ordinary dimensions and measures, providing a powerful and flexible way to handle unexpected or short-term data requirements as well as rapidly changing data. Compared to earlier work, we present several major extensions that resulted from TARGIT's requirements. These include the ability to use XML data as measures, as well as a novel multigranular data model and query language that formalizes and extends the TARGIT data model and query language.
