A Conceptual Model for Multidimensional Analysis of Documents

Data warehousing and OLAP are mainly used for the analysis of transactional data. With the evolution of the Internet and the development of semi-structured data exchange formats (such as XML), it is now possible to consider entire fragments of data, such as documents, as analysis sources. As a consequence, an adapted multidimensional analysis framework needs to be provided. In this paper, we introduce an OLAP multidimensional conceptual model without facts. This model is based on the single concept of dimensions and is adapted for multidimensional document analysis. We also provide a set of manipulation operations.
