An Incremental Dimensionality Reduction Method for Visualizing Streaming Multidimensional Data

Dimensionality reduction (DR) methods are commonly used for analyzing and visualizing multidimensional data. However, when the data arrives as a live stream, conventional DR methods cannot be used directly: they are computationally expensive, and they do not preserve the positions at which earlier data points were projected. The problem becomes even more challenging when the incoming records have a varying number of dimensions, as is often the case in real-world applications. This paper presents an incremental DR solution. We enhance an existing incremental PCA method in several ways to make it usable for visualizing streaming multidimensional data. First, we use geometric transformation and animation methods to help preserve a viewer's mental map when visualizing the incremental results. Second, to handle records with differing numbers of dimensions, we use an optimization method to estimate the projected data positions, and we convey the resulting uncertainty in the visualization. We demonstrate the effectiveness of our design with two case studies using real-world datasets.
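
To make the first enhancement concrete, the following is a minimal sketch of the general idea rather than the paper's implementation. It assumes that a running mean and scatter matrix stand in for the incremental PCA method, and that a Procrustes-style similarity transform (rotation or reflection, uniform scale, translation) serves as the geometric transformation that keeps previously drawn points near their old positions. The names `RunningPCA` and `procrustes_align` are hypothetical, and the handling of records with varying dimensions is not covered.

```python
import numpy as np


class RunningPCA:
    """Running mean/scatter PCA; an illustrative stand-in for incremental PCA."""

    def __init__(self, n_features, n_components=2):
        self.n = 0
        self.mean = np.zeros(n_features)
        self.scatter = np.zeros((n_features, n_features))
        self.k = n_components

    def partial_fit(self, batch):
        batch = np.asarray(batch, dtype=float)
        m = batch.shape[0]
        batch_mean = batch.mean(axis=0)
        centered = batch - batch_mean
        delta = batch_mean - self.mean
        total = self.n + m
        # Merge the batch statistics into the running statistics.
        self.scatter += centered.T @ centered + np.outer(delta, delta) * self.n * m / total
        self.mean += delta * m / total
        self.n = total

    def transform(self, X):
        # eigh returns eigenvalues in ascending order, so reverse the columns.
        _, eigvecs = np.linalg.eigh(self.scatter)
        top = eigvecs[:, ::-1][:, : self.k]
        return (np.asarray(X, dtype=float) - self.mean) @ top


def procrustes_align(new_pts, old_pts):
    """Return a function mapping new positions toward old ones in the least-squares sense."""
    mu_new, mu_old = new_pts.mean(axis=0), old_pts.mean(axis=0)
    A, B = new_pts - mu_new, old_pts - mu_old
    U, s, Vt = np.linalg.svd(A.T @ B)
    R = U @ Vt                        # best rotation or reflection
    scale = s.sum() / (A ** 2).sum()  # best uniform scale
    return lambda P: scale * (P - mu_new) @ R + mu_old


# Demo on synthetic data (assumed, not from the paper).
rng = np.random.default_rng(0)
first, second = rng.normal(size=(50, 6)), rng.normal(size=(50, 6))

model = RunningPCA(n_features=6)
model.partial_fit(first)
old_pos = model.transform(first)     # positions the viewer has already seen

model.partial_fit(second)
new_pos = model.transform(first)     # same points under the updated axes
align = procrustes_align(new_pos, old_pos)

print("drift without alignment:", np.linalg.norm(new_pos - old_pos))
print("drift with alignment:   ", np.linalg.norm(align(new_pos) - old_pos))
```

The same `align` function would also be applied to the newly arrived points so that the whole view moves consistently. Because eigenvector signs are arbitrary, the unaligned projection can flip between updates, which is one reason an explicit alignment step helps preserve the mental map.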
