Analyzing textual documents with new OLAP operators

As the volume of data inside and outside the enterprise grows rapidly, analyzing both internal and external data is becoming important for comprehensive business intelligence. While online analytical processing (OLAP) techniques have proven very useful for analyzing structured data, they face challenges in handling unstructured data. To address this, new multidimensional models have been proposed for OLAP purposes. Nevertheless, no existing proposal manages both the structure of documents and the semantics of their textual content. In our previous work, we proposed integrating the entire document within a Diamond multidimensional model. In this paper, building on that model, we provide new OLAP operators that take its specificities into account.
