Towards interactive analytics and visualization on one billion tweets

We present Cloudberry, a system that lets users interactively query, analyze, and visualize large amounts of data with temporal, spatial, and textual dimensions. As a general-purpose, full-stack solution, it combines a friendly UI, intelligent middleware, and a big data management backend running Apache AsterixDB. We will demonstrate the system using Twitter data on a computer cluster.
