Xtreaming: an incremental multidimensional projection technique and its application to streaming data

Streaming data applications are becoming more common because information sources such as sensors and social media can continuously capture or produce data. Despite recent advances, most visualization approaches, in particular multidimensional projection or dimensionality reduction techniques, cannot be directly applied in such scenarios due to the transient nature of streaming data. Currently, only a few methods address this limitation using online or incremental strategies that continuously process data and update the visualization. Despite their relative success, most of them need to store the data and access it multiple times, which makes them unsuitable for streaming scenarios where data grow continuously. Others do not impose such requirements but cannot update the positions of data already projected, potentially resulting in visual artifacts. In this paper, we present Xtreaming, a novel incremental projection technique that continuously updates the visual representation to reflect newly emerging structures or patterns without visiting the multidimensional data more than once. Our tests show that Xtreaming is competitive in terms of global distance preservation when compared to other streaming and incremental techniques, while being orders of magnitude faster. To the best of our knowledge, it is the first methodology capable of evolving a projection to faithfully represent newly emerging structures without the need to store all data, providing reliable results for efficiently and effectively projecting streaming data.
