A Holistic Approach for Query Evaluation and Result Vocalization in Voice-Based OLAP

We address the problem of answering OLAP queries via voice output. We present a holistic approach that combines query processing and result vocalization, built on three key ideas that minimize processing overhead and maximize answer quality. First, our approach samples from the database to evaluate alternative speech fragments. OLAP queries are not fully evaluated; instead, sampling focuses on the result aspects that are relevant for voice output. To guide sampling, we rely on methods from Monte-Carlo Tree Search. Second, we use pipelining to interleave query processing and voice output: the system starts providing the user with high-level insights while generating more fine-grained results in the background. Third, we optimize speech output to maximize the user's information gain under speaking-time constraints. We use a maximum-entropy model to predict the user's belief about OLAP results after listening to voice output. Based on that model, we select the most informative speech fragments (i.e., the ones that minimize the distance between user belief and actual data). We analyze formal properties of the proposed speech structure and the complexity of our algorithm. We also compare alternative vocalization approaches in an extensive user study.
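The first idea, sampling to compare alternative speech fragments, can be illustrated with a deliberately simplified sketch. It is not the algorithm from the paper: it uses a flat UCB1 bandit (the selection rule underlying many Monte-Carlo Tree Search variants) instead of a full search tree, and it replaces the maximum-entropy belief model with a crude stand-in in which the listener assumes a fixed prior for every value the speech does not mention. All names (`ucb1_select`, `reward`, `pick_fragment`, `PRIOR`, `VALUE_RANGE`), the fragment encoding, and the toy data are assumptions made for illustration.

```python
import math
import random

PRIOR = 50.0         # assumed belief about any value the speech does not mention
VALUE_RANGE = 100.0  # assumed width of the value domain, used to scale rewards


def ucb1_select(counts, totals, c=math.sqrt(2)):
    """Return the index of the fragment to evaluate next under UCB1."""
    # Evaluate every fragment once before applying the confidence bound.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    t = sum(counts)
    return max(
        range(len(counts)),
        key=lambda i: totals[i] / counts[i]
        + c * math.sqrt(math.log(t) / counts[i]),
    )


def reward(fragment, row):
    """Score in [0, 1]: 1 minus the scaled gap between belief and sampled value."""
    scope, claimed = fragment
    # Hypothetical belief model: the listener adopts the claimed value for
    # rows in the fragment's scope and keeps the prior for all other rows.
    belief = claimed if scope in ("*", row["region"]) else PRIOR
    return 1.0 - min(abs(belief - row["value"]) / VALUE_RANGE, 1.0)


def pick_fragment(fragments, rows, rounds=600, seed=0):
    """Sample rows instead of scanning them all; return the best-scoring fragment.

    Assumes rounds >= len(fragments) so that every fragment is tried once.
    """
    rng = random.Random(seed)
    counts = [0] * len(fragments)
    totals = [0.0] * len(fragments)
    for _ in range(rounds):
        i = ucb1_select(counts, totals)
        row = rng.choice(rows)  # one sampled row per evaluation
        counts[i] += 1
        totals[i] += reward(fragments[i], row)
    best = max(range(len(fragments)), key=lambda i: totals[i] / counts[i])
    return fragments[best]


# Toy data: two regions with very different values (illustrative only).
rows = [{"region": "East", "value": 10.0}] * 50 + [
    {"region": "West", "value": 90.0}
] * 50
# Each fragment is (scope, claimed value); "*" means the claim covers all rows.
fragments = [("*", 50.0), ("East", 10.0), ("West", 60.0)]
best = pick_fragment(fragments, rows)
```

In this toy setup the expected rewards are 0.6 for the overall claim, 0.8 for the East claim, and 0.65 for the West claim, so sampling should favor the East fragment without reading every row. A fuller treatment would organize fragments in a tree, as the abstract's reference to Monte-Carlo Tree Search suggests, and would enforce the speaking-time budget.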
