Approximate Query Engines: Commercial Challenges and Research Opportunities

Recent years have witnessed a surge of interest in Approximate Query Processing (AQP) solutions, both in academia and the commercial world. In addition to well-known open problems in this area, many new research challenges have surfaced as a result of the first interactions of AQP technology with commercial and real-world customers. We categorize these into deployment, planning, and interface challenges. At the same time, AQP settings introduce many interesting opportunities that would not be possible in a database that returns only precise answers. These opportunities offer hope of overcoming some of the major limitations of traditional database systems. For example, we discuss how a database can reuse its past work in a generic way and become smarter as it answers new queries. Our goal in this talk is to suggest some of the exciting research directions in this field that are worth pursuing.
