Visual exploration of classification models for risk assessment

In risk assessment applications, well-informed decisions are made based on large amounts of multi-dimensional data. In many domains, not only the risk of a wrong decision but in particular the trade-off between the costs of possible decisions is of utmost importance. In this paper we describe a framework that tightly integrates interactive visual exploration with machine learning to support the decision-making process. The proposed approach uses a series of interactive 2D visualizations of numeric and ordinal data combined with visualizations of classification models. This series of visual elements is further linked to the classifier's performance, which is visualized using an interactive performance curve. An interactive decision point on the performance curve allows the decision maker to steer the classification model and instantly identify the critical, cost-changing data elements in the various linked visualizations. The critical data elements are represented as images in order to trigger associations related to the knowledge of the expert. In this context, the data visualization and classification results are not only linked together but are also linked back to the classification model. Such a visual analytics framework allows users to interactively explore the costs of their decisions for different settings of the model, select the most suitable classification model accordingly, and make more informed and reliable decisions. A case study on data from the forensic psychiatry domain demonstrates the usefulness of the suggested approach.
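The idea of a movable decision point can be illustrated with a minimal sketch. The abstract does not specify the type of performance curve, the classifier, or the cost model, so everything here is an assumption: the curve is taken to be a ROC curve, the decision point is a threshold on a classifier score, and the function names, sample scores, and misclassification costs are hypothetical. The sketch shows how moving the threshold changes the total cost and which data elements flip their predicted class (the "critical" elements that a linked view could highlight).

```python
from typing import List, Tuple


def predict(scores: List[float], threshold: float) -> List[int]:
    """Label an element positive (1) when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def roc_point(scores: List[float], labels: List[int],
              threshold: float) -> Tuple[float, float]:
    """Return (false positive rate, true positive rate) at this threshold."""
    preds = predict(scores, threshold)
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    pos = sum(labels)
    neg = len(labels) - pos
    return fp / neg, tp / pos


def total_cost(scores: List[float], labels: List[int], threshold: float,
               cost_fp: float, cost_fn: float) -> float:
    """Total misclassification cost; the two cost values are assumed inputs."""
    preds = predict(scores, threshold)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    return fp * cost_fp + fn * cost_fn


def critical_elements(scores: List[float], old_t: float,
                      new_t: float) -> List[int]:
    """Indices whose predicted class changes when the threshold moves."""
    lo, hi = sorted((old_t, new_t))
    return [i for i, s in enumerate(scores) if lo <= s < hi]


# Hypothetical classifier scores and true labels (1 = high risk).
scores = [0.1, 0.4, 0.35, 0.8, 0.65, 0.9]
labels = [0, 0, 1, 1, 0, 1]

# Assumed costs: a missed high-risk case is five times worse than a false alarm.
cost_before = total_cost(scores, labels, 0.5, cost_fp=1.0, cost_fn=5.0)
cost_after = total_cost(scores, labels, 0.3, cost_fp=1.0, cost_fn=5.0)
flipped = critical_elements(scores, 0.5, 0.3)
```

In an interactive setting, dragging the decision point would correspond to recomputing `roc_point`, `total_cost`, and `critical_elements` for the new threshold and highlighting the returned indices in every linked view.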
