A client-based visual analytics framework for large spatiotemporal data under architectural constraints

A primary aim of visual analytics is to provide end users with interactive, scalable environments that facilitate their decision-making tasks. Researchers have often relied on server-client solutions to support interactive data exploration (e.g., building a data cube or parallelizing data processing). However, these solutions can suffer from scalability issues, especially when the servers do not provide adequate computational capability. Organizational policies can also prohibit the transfer of data to external data servers because of security or budgetary concerns, thereby severely limiting the capability of visual analytics systems. Therefore, in this paper, we propose an interactive client-based visual analytics framework for large-scale spatiotemporal data. The proposed framework follows a sampling-based incremental visual analysis approach to sustain real-time responsiveness while staying within the computational resources affordable on a client machine. General sampling methods [34] preprocess the entire dataset to build a data index, which can impose unaffordable computational overhead on the client. Instead, our framework proposes a novel data management model that uses spatiotemporal clustering patterns to predictively organize and sample data based on historical data acquisition activities. We demonstrate the capabilities and usefulness of our framework by applying it to crime data and Twitter data. We also conduct several experimental evaluations to determine the efficacy of our framework.
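To make the idea of sampling-based incremental analysis concrete, the following is a minimal sketch, not the framework's actual data management model. It assumes plain uniform random sampling rather than the predictive, clustering-based sampling proposed in the paper, and the record layout `(x, y, timestamp)`, the function name `incremental_grid_counts`, and the parameters `cell_size` and `chunk_size` are all hypothetical. After each chunk it yields a scaled estimate of the per-cell counts, so a client view could be refreshed before the full dataset has been read.

```python
import random
from collections import Counter
from typing import Dict, Iterator, List, Tuple

# Hypothetical record layout: (x, y, timestamp).
Point = Tuple[float, float, int]
Cell = Tuple[int, int]


def incremental_grid_counts(
    points: List[Point],
    cell_size: float,
    chunk_size: int,
    seed: int = 0,
) -> Iterator[Dict[Cell, float]]:
    """Yield progressively refined per-cell count estimates.

    Assumption: uniform random sampling without replacement, which
    stands in for the paper's predictive sampling strategy.
    """
    # Visit records in random order so every prefix is a uniform sample.
    order = list(range(len(points)))
    random.Random(seed).shuffle(order)

    counts: Counter = Counter()
    for start in range(0, len(order), chunk_size):
        for i in order[start:start + chunk_size]:
            x, y, _t = points[i]
            counts[(int(x // cell_size), int(y // cell_size))] += 1
        seen = min(start + chunk_size, len(order))
        # Scale sampled counts up to the full dataset size.
        scale = len(points) / seen
        yield {cell: c * scale for cell, c in counts.items()}
```

Each yielded dictionary is an estimate whose values sum to the dataset size; the last one equals the exact per-cell counts. A caller would typically redraw a heatmap after every yield and stop early once the picture is stable enough.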

[1] Carlo H. Séquin, et al. Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments, 1993, SIGGRAPH.

[2] Marios Hadjieleftheriou, et al. R-Trees - A Dynamic Index Structure for Spatial Searching, 2008, ACM SIGSPATIAL International Workshop on Advances in Geographic Information Systems.

[3] Renato Pajarola, et al. Out-Of-Core Algorithms for Scientific Visualization and Computer Graphics, 2002.

[4] Monica M. C. Schraefel, et al. Trust me, I'm partially right: incremental visualization lets analysts explore large datasets faster, 2012, CHI.

[5] Panos Kalnis, et al. Indexing spatio-temporal data warehouses, 2002, Proceedings 18th International Conference on Data Engineering.

[6] Bolin Ding, et al. Trust, but Verify: Optimistic Visualizations of Approximate Queries for Exploring Big Data, 2017, CHI.

[7] Michael Stonebraker, et al. Dynamic Prefetching of Data Tiles for Interactive Visualization, 2016, SIGMOD Conference.

[8] Bernd Hamann, et al. Real-time out-of-core visualization of particle traces, 2001, Proceedings IEEE 2001 Symposium on Parallel and Large-Data Visualization and Graphics (Cat. No.01EX520).

[9] Pat Hanrahan, et al. Maintaining interactivity while exploring massive time series, 2008, 2008 IEEE Symposium on Visual Analytics Science and Technology.

[10] Dominik Moritz, et al. What Users Don't Expect about Exploratory Data Analysis on Approximate Query Processing Systems, 2017, HILDA@SIGMOD.

[11] David S. Ebert, et al. A correlative analysis process in a visual analytics environment, 2012, 2012 IEEE Conference on Visual Analytics Science and Technology (VAST).

[12] M. B. Peterson. Intelligence-Led Policing: The New Intelligence Architecture, 2005.

[13] Ioana Manolescu, et al. EdiFlow: Data-intensive interactive workflows for visual analytics, 2010, 2011 IEEE 27th International Conference on Data Engineering.

[14] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.

[15] Paolo Cignoni, et al. Adaptive tetrapuzzles: efficient out-of-core construction and visualization of gigantic multiresolution polygonal models, 2004, ACM Trans. Graph.

[16] Rob J Hyndman, et al. Another look at measures of forecast accuracy, 2006.

[17] Hamid Pirahesh, et al. Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals, 1996, Data Mining and Knowledge Discovery.

[18] Carlos Eduardo Scheidegger, et al. Hashedcubes: Simple, Low Memory, Real-Time Visual Exploration of Big Data, 2017, IEEE Transactions on Visualization and Computer Graphics.

[19] Samuel Madden, et al. TrajStore: An adaptive storage system for very large trajectory data sets, 2010, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).

[20] Helen J. Wang, et al. Online aggregation, 1997, SIGMOD '97.

[21] Danyel Fisher, et al. Big data exploration requires collaboration between visualization and data infrastructures, 2016, HILDA '16.

[22] Lu Wang, et al. Spatial Online Sampling and Aggregation, 2015, Proc. VLDB Endow.

[23] Anthony J. Lattanze. Architecting Software Intensive Systems: A Practitioners Guide, 2008.

[24] V. A. Epanechnikov. Non-Parametric Estimation of a Multivariate Probability Density, 1969.

[25] Suman Nath, et al. Mercury: A memory-constrained spatio-temporal real-time search on microblogs, 2014, 2014 IEEE 30th International Conference on Data Engineering.

[26] Abraham Silberschatz, et al. Operating System Concepts, 1983.

[27] Jeffrey Heer, et al. imMens: Real‐time Visual Querying of Big Data, 2013, Comput. Graph. Forum.

[28] Surajit Chaudhuri, et al. Sample + Seek: Approximating Aggregates with Distribution Precision Guarantee, 2016, SIGMOD Conference.

[29] Michael Cox, et al. Application-controlled demand paging for out-of-core visualization, 1997.

[30] David S. Ebert, et al. Visual Analytics Law Enforcement Toolkit, 2010, 2010 IEEE International Conference on Technologies for Homeland Security (HST).

[31] Suman Nath, et al. Mars: Real-time spatio-temporal queries on microblogs, 2014, 2014 IEEE 30th International Conference on Data Engineering.

[32] Carlos Eduardo Scheidegger, et al. Nanocubes for Real-Time Exploration of Spatiotemporal Datasets, 2013, IEEE Transactions on Visualization and Computer Graphics.