Towards Integrated Data Analytics: Time Series Forecasting in DBMS

Integrating sophisticated statistical methods into database management systems is attracting growing attention in research and industry as a way to cope with increasing data volumes and increasingly complex analytical algorithms. One important statistical method is time series forecasting, which is crucial for decision-making processes in many domains. Deeply integrating time series forecasting adds advanced functionality to a DBMS. More importantly, it enables optimizations that improve the efficiency, consistency, and transparency of the overall forecasting process. To enable efficient integrated forecasting, we propose enhancing the traditional 3-layer ANSI/SPARC architecture of a DBMS with forecasting functionality. This article gives a general overview of the proposed enhancements and shows how forecast queries can be processed, using an example from the energy data management domain. We conclude with open research topics and challenges in this area.
