Visualizing Time-Dependent Data Using Dynamic t-SNE

Many interesting processes can be represented as time-dependent datasets. We define a time-dependent dataset as a sequence of datasets captured at particular time steps. In such a sequence, each dataset is composed of observations (high-dimensional real vectors), and each observation has a corresponding observation at every other time step. Dimensionality reduction provides a scalable way to create visualizations (projections) that give insight into the structure of such datasets. However, applying dimensionality reduction independently to each dataset in a sequence may introduce unnecessary variability into the resulting sequence of projections, which makes tracking the evolution of the data significantly more challenging. We show that this issue affects t-SNE, a widely used dimensionality reduction technique. To address it, we propose dynamic t-SNE, an adaptation of t-SNE that introduces a controllable trade-off between temporal coherence and projection reliability. Our evaluation on two time-dependent datasets shows that dynamic t-SNE eliminates unnecessary temporal variability and encourages smooth changes between projections.
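
The trade-off between temporal coherence and projection reliability can be illustrated with a small sketch. It assumes one plausible form of the objective, which the text above does not spell out: the sum of the per-time-step t-SNE costs (KL divergences) plus a weighted penalty on how far each point moves between consecutive projections. The names lam, sigma, temporal_penalty and dynamic_cost are hypothetical, and the fixed Gaussian bandwidth is a simplification that stands in for the perplexity-based calibration of standard t-SNE. The sketch only evaluates the cost and does not optimize it.

```python
import numpy as np

def pairwise_sq_dists(X):
    """Squared Euclidean distances between all rows of X."""
    sq = np.sum(X ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(D, 0.0)
    return np.maximum(D, 0.0)  # clip tiny negatives from round-off

def input_affinities(X, sigma=1.0):
    """Symmetrized Gaussian affinities P over the high-dimensional data.
    Assumption: a single fixed bandwidth sigma instead of the per-point
    perplexity calibration used by standard t-SNE."""
    P = np.exp(-pairwise_sq_dists(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(P, 0.0)
    P /= P.sum(axis=1, keepdims=True)
    return (P + P.T) / (2.0 * len(X))

def output_affinities(Y):
    """Student-t affinities Q over the low-dimensional projection."""
    Q = 1.0 / (1.0 + pairwise_sq_dists(Y))
    np.fill_diagonal(Q, 0.0)
    return Q / Q.sum()

def kl_cost(P, Q, eps=1e-12):
    """KL(P || Q): the per-time-step projection cost."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / np.maximum(Q[mask], eps))))

def temporal_penalty(Ys):
    """Squared movement of each point between consecutive projections,
    scaled by 1 / (2N). The scaling is an assumption of this sketch."""
    n = Ys[0].shape[0]
    moved = sum(float(np.sum((a - b) ** 2)) for a, b in zip(Ys[:-1], Ys[1:]))
    return moved / (2.0 * n)

def dynamic_cost(Xs, Ys, lam=0.1, sigma=1.0):
    """Per-step KL costs plus lam times the temporal penalty.
    lam = 0 reduces to independent per-step costs; larger lam favors
    temporal coherence over per-step fidelity."""
    fidelity = sum(
        kl_cost(input_affinities(X, sigma), output_affinities(Y))
        for X, Y in zip(Xs, Ys)
    )
    return fidelity + lam * temporal_penalty(Ys)
```

With lam set to zero, each projection is scored independently, which is the setting in which unnecessary variability can appear. Increasing lam raises the cost of any sequence in which corresponding points drift between time steps, so minimizing the combined cost would favor smoother sequences at some expense to per-step fidelity.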
