Data Transformations and Representations for Computation and Visualization

At the core of successful visual analytics systems are computational techniques that transform data into concise, human-comprehensible visual representations. The general process often requires multiple transformation steps before a final visual representation is generated. This article characterizes the complex raw data to be analyzed and then describes two sets of transformations and representations. The first set transforms the raw data into more concise representations that improve the performance of sophisticated computational methods. The second set transforms these internal representations into visual representations that provide the most benefit to an interactive user. The end result is a computing system that enhances an end user's analytic process with effective visual representations and interactive techniques. While progress has been made on improving data transformations and representations, there is substantial room for improvement.
