Outliagnostics: Visualizing Temporal Discrepancy in Outlying Signatures of Data Entries

This paper presents an approach to analyzing two-dimensional temporal datasets that focuses on identifying the observations that most influence the calculation of outliers in a scatterplot. We also propose a prototype, called Outliagnostics, to guide users in interactively exploring abnormalities in large time series. Instead of detecting outliers at each time point, we monitor and display the discrepant temporal signature of each data entry with respect to the overall distributions. The prototype performs these tasks in parallel to improve performance. To highlight the benefits and performance of our approach, we illustrate and validate Outliagnostics on real-world datasets of various sizes under different parallelism configurations. We also discuss how to extend these ideas to time series with a higher number of dimensions and provide a prototype for this type of dataset.
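As a rough illustration of a per-entry temporal signature, the sketch below scores each scatterplot with a simplified outlying measure and records how that score changes when one data entry is left out. The abstract does not specify the measure or the procedure, so everything here is an assumption: the leave-one-out scheme, the minimum-spanning-tree score with an interquartile cutoff, and the hypothetical names `mst_edge_lengths`, `outlying_score`, and `leave_one_out_signatures`. It is not the Outliagnostics implementation.

```python
import numpy as np


def mst_edge_lengths(points):
    """Edge lengths of the Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    if n < 2:
        return np.array([])
    # Full pairwise distance matrix; fine for a small illustrative scatterplot.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()  # cheapest known connection of each vertex to the tree
    lengths = []
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates))
        lengths.append(candidates[j])
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return np.array(lengths)


def outlying_score(points):
    """Assumed score: share of total MST length carried by unusually long edges."""
    lengths = mst_edge_lengths(points)
    if lengths.size == 0 or lengths.sum() == 0:
        return 0.0
    q25, q75 = np.percentile(lengths, [25, 75])
    cutoff = q75 + 1.5 * (q75 - q25)  # assumed interquartile-range rule
    return float(lengths[lengths > cutoff].sum() / lengths.sum())


def leave_one_out_signatures(series):
    """series has shape (T, N, 2): T time points, N entries, 2 dimensions.

    Returns an (N, T) array of score(all entries) - score(without entry i).
    """
    n_times, n_entries, _ = series.shape
    signatures = np.zeros((n_entries, n_times))
    for t in range(n_times):
        full = outlying_score(series[t])
        for i in range(n_entries):
            reduced = np.delete(series[t], i, axis=0)
            signatures[i, t] = full - outlying_score(reduced)
    return signatures
```

Under these assumptions, a large positive value in an entry's row means that removing the entry makes the scatterplot at that time point noticeably less outlying. The loops over time points and entries are independent of each other, which is the kind of workload that could be distributed across parallel workers.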
