Surveying the complementary role of automatic data analysis and visualization in knowledge discovery

The aim of this work is to survey and reflect on the various ways of integrating visualization and data mining techniques toward mixed-initiative knowledge discovery that takes the best of human and machine capabilities. Following a bottom-up bibliographic research approach, the article categorizes the observed techniques into classes, highlighting current trends, gaps, and potential future directions for research. In particular, it looks at the strengths and weaknesses of information visualization and data mining, at the purposes for which infovis researchers use data mining techniques, and conversely at how data mining researchers employ infovis techniques. The article then uses this information to analyze the discovery process by comparing the analysis steps from the perspectives of information visualization and data mining. The comparison brings to light new perspectives on how mining and visualization can best employ human and machine skills.
