A scalable query materialization algorithm for interactive data exploration

Data exploration is the process of efficiently extracting insights from stored data even when the user does not know, or is uncertain about, exactly what they want. Iterative interaction between the user and a search system helps to achieve this goal. Interactive data exploration (IDE) is one such system: it supports data exploration by incorporating user feedback on the retrieved data. These systems are a key ingredient of many discovery-oriented and recall-oriented real-life applications, such as scientific computing, financial analysis, social analytics, and evidence-based medicine. IDE supports the user's navigation through the data space via query-to-query transitions, which together form an ‘exploratory session’. An exploratory session often consists of long, complex, analytical queries, which take a long time to process when run against large, multi-faceted data. The user's original query can be decomposed into multiple candidate queries, called ‘checkpoint queries’, which are then considered for materialization. In this paper, we introduce the notion of ‘checkpoint queries’ and discuss how two heuristics, query frequency and query result overlap ratio (QROR), are used in a greedy selection of these candidate queries from an exploratory session. We observe that checkpoint queries yield a significant improvement in query processing time, because materialized checkpoint queries can be reused to answer later queries during query navigation. The key challenges in this process are selecting the checkpoint queries to materialize and improving query reuse for query answering. Despite its simplicity, our algorithm selects queries that give better performance than the views selected by existing algorithms.
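As a rough illustration of a greedy selection driven by query frequency and QROR, the following Python sketch may help. The abstract does not define QROR or the scoring rule, so everything here is an assumption: the `Candidate` type, the `qror` function (taken as the share of a session query's result rows already present in a candidate's result), the product `frequency * mean QROR` used as the score, and the `budget` parameter are all hypothetical and are not the paper's actual algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate checkpoint query (not from the paper)."""
    name: str         # label for the candidate query
    frequency: int    # how often it occurs in the exploratory session
    rows: frozenset   # ids of the rows in its result


def qror(candidate_rows, query_rows):
    """Assumed overlap ratio: fraction of query_rows covered by candidate_rows."""
    if not query_rows:
        return 0.0
    return len(candidate_rows & query_rows) / len(query_rows)


def select_checkpoints(candidates, session_results, budget):
    """Greedily pick up to `budget` candidates by frequency * mean QROR.

    The score is an assumed combination of the two heuristics; candidates
    with a score of zero are never selected.
    """
    def score(c):
        if not session_results:
            return 0.0
        mean_overlap = sum(qror(c.rows, q) for q in session_results) / len(session_results)
        return c.frequency * mean_overlap

    remaining = list(candidates)
    chosen = []
    while remaining and len(chosen) < budget:
        best = max(remaining, key=score)  # greedy step: take the best-scoring candidate
        if score(best) <= 0:
            break
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

Because the assumed score does not change as candidates are chosen, this loop is equivalent to taking the top-scoring candidates; a closer model of materialization would likely discount rows already covered by earlier picks and account for storage cost.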
