Interactive visual summarization of multidimensional data

Visualization has become integral to the knowledge discovery process across many domains. However, challenges remain in using visualization techniques effectively, especially when displaying, exploring, and analyzing large, multidimensional datasets such as weather and meteorological data. Direct visualizations of such datasets tend to produce images that are cluttered with excess detail and that communicate poorly at higher levels of abstraction. To address this problem, we present a visual summarization framework that intuitively reduces the data to its important and relevant characteristics. Summarization is performed in three broad steps. First, high-relevance data elements and clusters of similar data attributes are identified to reduce a dataset's size and dimensionality. Next, patterns, relationships, and outliers are extracted from the reduced data. Finally, the extracted summary characteristics are presented visually to the user. Such visualizations reduce excess visual detail and are better suited to the rapid comprehension of complex data. Users can interactively guide the summarization process, gaining insight into both how and why the summary results are produced. Our framework increases the benefits of mathematical analysis and interactive visualization by combining the strengths of the computer and the user to generate high-quality summaries. Initial results from applying our framework to large weather datasets have been positive, suggesting that our approach could benefit a wide range of domains and applications.
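The three summarization steps can be illustrated with a minimal sketch. It is not the framework's implementation: the abstract does not specify the algorithms used, so correlation-based attribute grouping and z-score outlier detection are stand-in assumptions, and every name here (`summarize`, `group_attributes`, `find_outliers`, the toy weather fields) is hypothetical. The visualization step is reduced to a printed text summary.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def group_attributes(rows, names, threshold=0.9):
    """Step 1 (dimensionality): greedily group attributes whose absolute
    correlation with a group's first member reaches the threshold.
    Assumption: a stand-in for whatever attribute clustering is really used."""
    columns = {n: [r[n] for r in rows] for n in names}
    groups = []
    for n in names:
        for g in groups:
            if abs(pearson(columns[n], columns[g[0]])) >= threshold:
                g.append(n)
                break
        else:
            groups.append([n])
    return groups


def find_outliers(rows, name, z_cut=2.0):
    """Step 2: indices of rows whose value lies more than z_cut standard
    deviations from the mean. Assumption: a simple z-score rule."""
    values = [r[name] for r in rows]
    m, s = mean(values), pstdev(values)
    if s == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - m) / s > z_cut]


def summarize(rows, names, relevant, threshold=0.9, z_cut=2.0):
    """Run the steps: filter rows by a user-supplied relevance predicate,
    group similar attributes, then look for outliers in one representative
    attribute per group."""
    kept = [r for r in rows if relevant(r)]              # Step 1 (size)
    groups = group_attributes(kept, names, threshold)    # Step 1 (dimensionality)
    representatives = [g[0] for g in groups]
    outliers = {n: find_outliers(kept, n, z_cut) for n in representatives}  # Step 2
    return {"rows_kept": len(kept), "groups": groups, "outliers": outliers}


# Toy weather records (invented for illustration only).
temps_c = [10, 12, 14, 16, 18, 20, 22, 24]
pressures = [1012, 1013, 1012, 1013, 950, 1013, 1012, 1013]
rows = [
    {"region": "coast", "temp_c": c, "temp_f": c * 9 / 5 + 32, "pressure": p}
    for c, p in zip(temps_c, pressures)
]
rows.append({"region": "inland", "temp_c": 30, "temp_f": 86.0, "pressure": 1000})

summary = summarize(
    rows,
    names=["temp_c", "temp_f", "pressure"],
    relevant=lambda r: r["region"] == "coast",  # the user's relevance choice
)

# Step 3 placeholder: a text summary instead of a rendered visualization.
for group in summary["groups"]:
    print(group[0], "stands for", group, "outlier rows:", summary["outliers"][group[0]])
```

In this sketch the user's influence enters through the `relevant` predicate and the two thresholds, which loosely mirrors the interactive guidance described above; a real system would expose these through the visualization itself.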
