StreamingCube: A Unified Framework for Stream Processing and OLAP Analysis

In most streaming applications, data streams need to be analyzed continuously so that decisions can be made instantly using the latest information. Data streams are often multidimensional and at a low level of abstraction, whereas analysts are interested in multi-level, interactive analysis of the streams across several dimensions. On-line analytical processing (OLAP) is a proven technique for such analysis of static data and has also been studied by some researchers for data streams. Traditionally, OLAP over data streams is achieved by coupling a stream processing engine with an OLAP engine. We believe that coupling multiple systems is not an efficient solution, as it results in lower performance (due to the transfer of data between systems), resource wastage (due to replication of data for each coupled system), and increased complexity and maintenance cost. To this end, we present StreamingCube, a unified framework for data stream processing and interactive OLAP analysis. The proposed framework possesses all the essential operators to process data streams and introduces a new operator, cubify, which maintains OLAP lattice nodes (materialized views) incrementally. The novelty of the cubify operator lies in this incremental maintenance of the materialized views. To demonstrate StreamingCube, a web-based GUI has been developed that enables users to register continuous queries (CQs). Once a CQ has been registered, users can perform different OLAP operations through the GUI for interactive analysis. The results of the OLAP queries and operations are displayed as tables and graphs.
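
To make the idea of incrementally maintained lattice nodes concrete, the following is a minimal Python sketch, not the StreamingCube implementation. It assumes a SUM/COUNT aggregate over one measure, one materialized view per subset of the dimensions, and explicit expiry of tuples that leave the window. The class name CubifySketch, its methods, and the sample dimensions (city, product, sales) are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


class CubifySketch:
    """Hypothetical stand-in for a cubify-style operator (illustration only)."""

    def __init__(self, dimensions, measure):
        self.dimensions = tuple(dimensions)
        self.measure = measure
        # One materialized view per lattice node, i.e. per subset of dimensions.
        # Each view maps a group key to a [sum, count] cell.
        self.nodes = {
            subset: defaultdict(lambda: [0.0, 0])
            for r in range(len(self.dimensions) + 1)
            for subset in combinations(self.dimensions, r)
        }

    def _apply(self, row, sign):
        # Apply the delta of a single tuple to every lattice node,
        # instead of recomputing any view from scratch.
        for subset, view in self.nodes.items():
            key = tuple(row[d] for d in subset)
            cell = view[key]
            cell[0] += sign * row[self.measure]
            cell[1] += sign
            if cell[1] == 0:
                del view[key]  # drop groups that no longer have any tuples

    def insert(self, row):
        """A new tuple arrives on the stream."""
        self._apply(row, +1)

    def expire(self, row):
        """A previously inserted tuple leaves the window (assumed to exist)."""
        self._apply(row, -1)

    def query(self, group_by):
        """Answer a roll-up or drill-down by reading the matching view."""
        wanted = set(group_by)
        subset = tuple(d for d in self.dimensions if d in wanted)
        return {key: cell[0] for key, cell in self.nodes[subset].items()}


cube = CubifySketch(["city", "product"], "sales")
first = {"city": "Tokyo", "product": "tea", "sales": 10}
cube.insert(first)
cube.insert({"city": "Tokyo", "product": "rice", "sales": 5})
cube.insert({"city": "Osaka", "product": "tea", "sales": 7})
print(cube.query(["city"]))  # roll-up to the city level
cube.expire(first)           # the oldest tuple leaves the window
print(cube.query(["city", "product"]))  # drill-down to the base node
```

Each arriving or expiring tuple touches one cell per lattice node, so an OLAP operation only reads an already maintained view. Materializing every node, as done here, grows exponentially with the number of dimensions; a practical system would likely materialize only a selected subset of nodes.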
