Visualizing High Dimensional Classifier Performance Data

Classifier performance evaluation typically yields a vast number of results and can therefore be approached as a problem of analyzing high-dimensional data. Exploring visual representations of this evaluation data lets us exploit the power of human visual perception: we can gain insight into the performance data, interact with it, and draw meaningful conclusions about the classifiers and domains under study. We illustrate how visual techniques based on a projection from a high-dimensional space to a lower-dimensional one enable such an exploratory process. Moreover, this approach can be viewed as a generalization of conventional evaluation procedures based on point metrics, which necessarily entail a greater loss of information. Finally, we show that within this framework the user can study the evaluation data from both a classifier point of view and a domain point of view, which is infeasible with traditional evaluation methods.
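
To make the idea of projecting evaluation results concrete, the following is a minimal sketch, not the procedure used in this work. It assumes that each classifier is described by a vector of error rates, one per domain, and it uses classical multidimensional scaling as one possible distance-based projection; the abstract itself does not name a specific projection. The `performance` matrix and the function name `classical_mds` are hypothetical and serve only as illustration.

```python
import numpy as np


def classical_mds(points, n_components=2):
    """Project the rows of `points` to `n_components` dimensions.

    Uses classical (Torgerson) multidimensional scaling on Euclidean
    distances. This is an assumed choice of projection for illustration.
    """
    points = np.asarray(points, dtype=float)
    n = points.shape[0]
    # Squared Euclidean distance between every pair of rows.
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Double centering converts squared distances into a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ sq_dist @ centering
    # Keep the eigenvectors that belong to the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


# Hypothetical error rates: one row per classifier, one column per domain.
performance = np.array([
    [0.10, 0.25, 0.30, 0.05],
    [0.12, 0.24, 0.28, 0.07],
    [0.40, 0.10, 0.15, 0.35],
])

# Classifier point of view: each classifier becomes one 2-D point.
classifier_view = classical_mds(performance)
# Domain point of view: transpose so that each domain becomes one 2-D point.
domain_view = classical_mds(performance.T)
```

Plotting `classifier_view` places classifiers with similar behavior across domains close together, while `domain_view` does the same for domains. A single point metric, by comparison, would collapse each row to one number before any comparison is made.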
