dbTouch: Analytics at your Fingertips

As we enter the era of data deluge, turning data into knowledge has become the major challenge across most sciences and businesses that deal with data. In addition, as we increase our ability to create data, more and more people are confronted with data management problems on a daily basis for numerous aspects of everyday life. A fundamental need is data exploration through interactive tools, i.e., being able to quickly and effortlessly determine data and patterns of interest. However, modern database systems have not been designed with data exploration and usability in mind; they require users with expert knowledge and skills, and they react in a strict and monolithic way to every user request, resulting in correct answers but slow response times. In this paper, we introduce the vision of a new generation of data management systems, called dbTouch; our vision is to enable interactive and intuitive data exploration via database kernels which are tailored for touch-based exploration. No expert knowledge is needed. Data is represented in a visual format, e.g., a column shape for an attribute or a fat rectangle shape for a table, while users can touch those shapes and interact/query with gestures as opposed to firing complex SQL queries. The system does not try to consume all data; instead it analyzes only parts of the data at a time, continuously refining the answers and continuously reacting to user input. Every single touch on a data object can be seen as a request to run an operator or a collection of operators over part of the data. Users react to running results and continuously adjust the data exploration: they continuously determine the data to be processed next by adjusting the direction and speed of a gesture, i.e., a collection of touches; the database system no longer has control over the data flow. We discuss the various benefits
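To make the idea of touch-driven, incremental processing concrete, the following is a minimal sketch and not the dbTouch implementation. It assumes a touch is reported as a normalized position in [0, 1] along a column shape, that each touch selects a small window of rows, and that the operator is a running average. All names (`RunningAverage`, `touch_to_slice`, `explore`, the `window` parameter) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class RunningAverage:
    """Hypothetical aggregate that is refined as more rows are touched."""
    total: float = 0.0
    count: int = 0

    def update(self, values: Sequence[float]) -> float:
        # Fold the newly touched values into the running state.
        self.total += sum(values)
        self.count += len(values)
        return self.total / self.count if self.count else 0.0


def touch_to_slice(touch_pos: float, n_rows: int, window: int) -> slice:
    """Map a touch position in [0, 1] along a column shape to a row range."""
    clamped = min(max(touch_pos, 0.0), 1.0)
    start = min(int(clamped * n_rows), max(n_rows - 1, 0))
    return slice(start, min(start + window, n_rows))


def explore(column: Sequence[float], gesture: Sequence[float],
            window: int = 2) -> List[float]:
    """Process a gesture (a sequence of touches) one touch at a time.

    Each touch runs the operator over only the rows under the finger,
    and the running result is emitted after every touch. Rows that were
    already processed are skipped so they are not counted twice.
    """
    agg = RunningAverage()
    seen = set()
    results = []
    for pos in gesture:
        rows = touch_to_slice(pos, len(column), window)
        fresh = [i for i in range(rows.start, rows.stop) if i not in seen]
        seen.update(fresh)
        results.append(agg.update([column[i] for i in fresh]))
    return results


if __name__ == "__main__":
    column = [10, 20, 30, 40, 50, 60, 70, 80]
    # The user slides down the column; the order of touches, not the
    # system, decides which data is processed next.
    print(explore(column, [0.0, 0.5, 0.5, 0.9]))
```

In this sketch the gesture alone determines which rows are read and in what order, which mirrors the claim that the system no longer controls the data flow. A real kernel would also have to account for gesture speed and direction, which are left out here.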
