A Visual Analytics Approach to Monitor Time-Series Data with Incremental and Progressive Functional Data Analysis

Many real-world applications involve analyzing time-dependent phenomena, which are intrinsically functional: they consist of curves varying over a continuum, which is time in this case. When analyzing continuous data, functional data analysis (FDA) provides substantial benefits, such as the ability to study derivatives and to restrict the ordering of data. However, continuous data is inherently infinite-dimensional, and FDA methods often suffer from high computational costs. This cost becomes even more critical when new data keeps arriving and the FDA results must be updated in real time. In this paper, we present a visual analytics approach to continuously monitor and review changing time-series data, with a focus on identifying outliers using FDA. To perform such an analysis while addressing the computational problem, we introduce new incremental and progressive algorithms that promptly generate the magnitude-shape (MS) plot, which reveals both the functional magnitude outlyingness and the shape outlyingness of time-series data. In addition, by using the MS plot in conjunction with an FDA version of principal component analysis, we enhance the analyst's ability to investigate the visually identified outliers. We illustrate the effectiveness of our approach with three case studies using real-world and synthetic datasets.
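
To make the two axes of the MS plot concrete, the following is a minimal batch sketch, not the incremental or progressive algorithms this paper introduces. It assumes univariate series sampled on a shared time grid and uses one common formulation of directional outlyingness: the signed deviation from the pointwise median, scaled by the pointwise median absolute deviation (MAD). The mean of that quantity over time serves as the magnitude coordinate and its variance over time as the shape coordinate. The function name `ms_outlyingness` and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np


def ms_outlyingness(curves):
    """Return (mo, vo) for an array of shape (n_series, n_timepoints).

    mo: mean over time of the signed, MAD-scaled deviation from the
        pointwise median (magnitude outlyingness).
    vo: variance over time of the same quantity (shape outlyingness).
    Assumption: univariate series on a common, evenly spaced time grid.
    """
    curves = np.asarray(curves, dtype=float)
    med = np.median(curves, axis=0)                 # pointwise median curve
    mad = np.median(np.abs(curves - med), axis=0)   # pointwise MAD
    mad = np.where(mad > 0, mad, 1.0)               # guard against division by zero
    out = (curves - med) / mad                      # signed outlyingness per time point
    return out.mean(axis=1), out.var(axis=1)


# Toy data: 21 level-shifted copies of a sine curve, plus two outliers.
t = np.linspace(0.0, 1.0, 101)
base = np.sin(2 * np.pi * t)
regular = base + np.linspace(-1.0, 1.0, 21)[:, None]
magnitude_outlier = base + 10.0                      # same shape, shifted far upward
shape_outlier = base + 2.0 * np.sin(6 * np.pi * t)   # similar level, different shape
curves = np.vstack([regular, magnitude_outlier, shape_outlier])

mo, vo = ms_outlyingness(curves)
print("largest |MO| at index:", int(np.argmax(np.abs(mo))))
print("VO of shape outlier:", float(vo[22]))
```

Plotting `mo` against `vo` as a scatter plot gives a basic MS plot: in the toy data, the shifted curve separates along the magnitude axis, while the curve with the extra oscillation has a much larger shape coordinate than the regular curves. Because this sketch recomputes every pointwise median from scratch, its cost grows with the full data size on each update, which is the kind of cost that incremental and progressive variants aim to avoid.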
