StreamStory: Exploring Multivariate Time Series on Multiple Scales

This paper presents an approach for the interactive visualization, exploration, and interpretation of large multivariate time series. Interesting patterns in such datasets usually appear as periodic or recurrent behavior, often caused by interactions between variables. To identify such patterns, we summarize the data as conceptual states and model the temporal dynamics as transitions between those states. This representation can visualize large datasets with potentially billions of examples. We extend the representation to multiple spatial granularities, allowing the user to find patterns on multiple scales. The result is an interactive web-based tool called StreamStory. StreamStory couples the abstraction with several tools that map it back to domain-specific concepts using techniques from statistics and machine learning. It is aimed at users who are not experts in data analytics and works out of the box with a minimal number of parameters to configure. We use three real-world datasets to demonstrate how StreamStory supports three main visual analytics tasks: identifying the main states of a complex system and mapping them back to data-specific concepts, finding high-level and long-term periodic behavior, and traversing the scales to identify which ones exhibit interesting phenomena. We find and interpret several known as well as previously unknown patterns in these datasets.
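
The state-and-transition idea described above can be illustrated with a minimal sketch. It is not StreamStory's implementation: it assumes fixed, hand-picked centroids in place of a learned clustering, and it estimates a simple discrete-time transition matrix from consecutive samples. The function names `assign_states` and `transition_matrix`, as well as the toy data, are hypothetical.

```python
from collections import Counter
from math import dist


def assign_states(series, centroids):
    """Map each multivariate sample to the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: dist(sample, centroids[k]))
        for sample in series
    ]


def transition_matrix(states, n_states):
    """Estimate transition probabilities from counts of consecutive state pairs."""
    counts = Counter(zip(states, states[1:]))
    matrix = []
    for i in range(n_states):
        row_total = sum(counts[(i, j)] for j in range(n_states))
        # Rows with no outgoing transitions are left as all zeros.
        matrix.append(
            [counts[(i, j)] / row_total if row_total else 0.0 for j in range(n_states)]
        )
    return matrix


# Toy two-variable series that alternates between a "low" and a "high" regime.
series = [(0.1, 0.0), (0.0, 0.2), (1.0, 0.9), (0.9, 1.1), (0.1, 0.1), (1.0, 1.0)]
centroids = [(0.0, 0.0), (1.0, 1.0)]  # assumed centroids, chosen by hand

states = assign_states(series, centroids)
P = transition_matrix(states, len(centroids))
```

Here `states` is the sequence of state indices over time, and row `i` of `P` gives the estimated probability of moving from state `i` to each other state. A coarser scale could be approximated by merging states and recomputing the matrix, although the abstract does not specify how the tool aggregates states.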
