The Case for Predictive Database Systems: Opportunities and Challenges

This paper argues that next-generation database management systems should incorporate a predictive model management component to effectively support both inward-facing applications, such as self-management, and user-facing applications, such as data-driven predictive analytics. We draw an analogy between model management and data management functionality, discuss how model management can leverage profiling, physical design, and query optimization techniques, and outline the pertinent challenges. We then describe the early design and architecture of Longview, a predictive DBMS prototype that we are building at Brown, along with a case study of how models can be used to predict query execution performance.
