Understanding a Sequence of Sequences: Visual Exploration of Categorical States in Lake Sediment Cores

This design study addresses the analysis of a time sequence of categorical sequences. Such data arise in geoscientific research on landscape and climate development, where they result from the microscopic analysis of lake sediment cores. The goal is to derive hypotheses about past landscape evolution and climate conditions. To this end, geoscientists identify which categorical sequences are similar in the sense that they indicate similar conditions. Two categorical sequences are considered similar if they have similar meaning (semantic similarity) and occur in similar time periods (temporal similarity). For data sets with many distinct categorical sequences, identifying similar sequences becomes a challenge. Our contribution is a tailored visual analysis concept that effectively supports this analytical process. Our visual interface couples visualizations of semantics and temporal context for exploring and assessing the similarity of categorical sequences. Integrated automatic methods substantially reduce the analytical effort: they (1) extract the unique sequences in the data and (2) rank sequences by a similarity measure during the search for similar sequences. We evaluated our concept through demonstrations of our prototype to a larger audience and through hands-on analysis sessions for two different lakes. According to the geoscientists, our approach fills an important methodological gap in the application domain.
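The two automatic steps named above, extracting unique sequences and ranking them by similarity, can be illustrated with a minimal Python sketch. It is not the measure used in the study: the Jaccard index as a stand-in for semantic similarity, the window-based temporal overlap, the equal weighting, and all function names and category labels are assumptions made purely for illustration.

```python
from collections import defaultdict


def extract_unique(layers):
    """Map each distinct categorical sequence to the time indices where it occurs."""
    occurrences = defaultdict(list)
    for t, seq in enumerate(layers):
        occurrences[tuple(seq)].append(t)
    return dict(occurrences)


def jaccard(a, b):
    """Assumed semantic similarity: share of categories the two sequences have in common."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def temporal_overlap(times_a, times_b, window):
    """Assumed temporal similarity: fraction of occurrences in times_a
    that have an occurrence in times_b at most `window` steps away."""
    if not times_a:
        return 0.0
    near = sum(1 for ta in times_a if any(abs(ta - tb) <= window for tb in times_b))
    return near / len(times_a)


def rank_similar(query, occurrences, window=1, weight=0.5):
    """Rank all other unique sequences by a weighted mix of both similarities."""
    q = tuple(query)
    scored = []
    for seq, times in occurrences.items():
        if seq == q:
            continue
        semantic = jaccard(q, seq)
        temporal = temporal_overlap(occurrences[q], times, window)
        scored.append((seq, weight * semantic + (1 - weight) * temporal))
    # Highest score first; ties are broken by the sequence itself for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Hypothetical core: one categorical sequence per time step (e.g. per sediment layer).
layers = [
    ("clay", "calcite"),
    ("clay", "calcite"),
    ("clay", "diatoms"),
    ("sand",),
    ("clay", "calcite"),
    ("sand",),
]
occurrences = extract_unique(layers)
ranking = rank_similar(("clay", "calcite"), occurrences)
```

In this toy core, `("clay", "diatoms")` ranks above `("sand",)` for the query `("clay", "calcite")` because it shares a category with the query, while both candidates occur equally close to it in time. The `weight` parameter trades semantic against temporal similarity and would in practice be a choice left to the analyst.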
