OutViz: Visualizing the Outliers of Multivariate Time Series

This paper proposes OutViz, a dual-view framework for representing and filtering multivariate time series data to highlight abnormal patterns in a dataset. The first view is a parallel coordinates chart that lets the user analyze the scores of features extracted by an outlier detection algorithm based on dimensionality reduction and density-based clustering, in order to determine why a particular time series is predicted to be an outlier. The parallel coordinates chart also includes an outlier score rank axis, on which the user can select a range of time series to be filtered and displayed in the second view. The second view is a multi-line chart that shows how each time series variable changes over time. Each time series is drawn as a line, with the horizontal axis encoding the point in time and the vertical axis encoding the data value. Use cases with real-world multivariate time series data demonstrate the advantages of the proposed framework for data analytics and present findings uncovered while applying OutViz to life expectancy data from 236 countries between 1960 and 2018 and to carbon dioxide emissions data from 210 countries between 1960 and 2016.
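The score-then-filter workflow described in the abstract can be illustrated with a minimal sketch. It is not the OutViz algorithm: the three features (mean, standard deviation, linear slope), the mean k-nearest-neighbour distance used as a simple stand-in for a density-based outlier score, the function names, and the toy data are all assumptions made for illustration. The sketch omits the dimensionality reduction step and only shows how per-series feature scores can be turned into an outlier rank and how a rank range can select the series passed to a second view.

```python
import math
from statistics import mean, pstdev

def extract_features(series):
    """Hypothetical per-series features: mean, standard deviation, linear slope."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = mean(series)
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series)) / denom
    return [y_mean, pstdev(series), slope]

def standardize(rows):
    """Z-score each feature column so no single feature dominates distances."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def knn_outlier_scores(points, k=2):
    """Mean distance to the k nearest neighbours; sparse regions score higher.

    Assumption: a simple density proxy, not the paper's scoring method.
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

def rank_outliers(named_series, k=2):
    """Return (name, score) pairs sorted from most to least outlying."""
    names = list(named_series)
    feats = standardize([extract_features(named_series[n]) for n in names])
    scores = knn_outlier_scores(feats, k)
    return sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)

def filter_by_rank(ranked, first, last):
    """Keep the names whose 1-based outlier rank falls in [first, last]."""
    return [name for name, _ in ranked[first - 1:last]]

# Synthetic toy data (not taken from the paper's datasets).
data = {
    "A": [70.0, 70.5, 71.0, 71.5, 72.0, 72.5],
    "B": [69.0, 69.6, 70.1, 70.5, 71.1, 71.6],
    "C": [71.0, 71.4, 72.0, 72.6, 73.0, 73.5],
    "D": [68.0, 68.5, 69.1, 69.5, 70.0, 70.6],
    "E": [60.0, 61.0, 45.0, 40.0, 52.0, 58.0],  # series with a sharp dip
}
ranked = rank_outliers(data)
top = filter_by_rank(ranked, 1, 1)  # series selected by the rank range 1..1
```

In this toy setup the rank range plays the role of the brush on the outlier score rank axis: widening it from `(1, 1)` to `(1, 3)` would pass the three highest-ranked series to the line chart.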
