Class visualization of high-dimensional data with applications

The problem of visualizing high-dimensional data that has been categorized into various classes is considered. The goal in visualizing is to quickly absorb inter-class and intra-class relationships. Towards this end, class-preserving projections of the multidimensional data onto two-dimensional planes, which can be displayed on a computer screen, are introduced. These class-preserving projections maintain the high-dimensional class structure, and are closely related to Fisher's linear discriminants. By displaying sequences of such two-dimensional projections and by moving continuously from one projection to the next, an illusion of smooth motion through a multidimensional display can be created. Such sequences are called class tours. Furthermore, class-similarity graphs are overlaid on the two-dimensional projections to capture the distance relationships in the original high-dimensional space. The above visualization tools are illustrated on the classical Iris plant data, the ISOLET spoken letter data, and the PENDIGITS on-line handwriting data set. It is shown how the visual examination of the data can uncover latent class relationships.
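As a rough illustration of a class-preserving projection, the sketch below projects the data onto the two-dimensional plane that passes through the centroids of three chosen classes. This is one plausible construction, assumed here for illustration; the abstract does not spell out the exact procedure. The function name `class_preserving_projection` and its arguments are hypothetical.

```python
import numpy as np

def class_preserving_projection(X, y, classes):
    """Project rows of X onto the plane through three class centroids.

    Assumption (not stated in the abstract): the plane is spanned by the
    differences between the centroids of the three classes in `classes`.
    Returns the 2-D coordinates and the (d, 2) orthonormal basis.
    """
    a, b, c = (X[y == k].mean(axis=0) for k in classes)

    # Gram-Schmidt on the two centroid differences gives an orthonormal basis.
    u = b - a
    u = u / np.linalg.norm(u)
    v = c - a
    v = v - (v @ u) * u          # remove the component along u
    v = v / np.linalg.norm(v)    # fails if the three centroids are collinear

    basis = np.stack([u, v], axis=1)   # shape (d, 2)
    return (X - a) @ basis, basis
```

Because the three centroids lie in the chosen plane, their pairwise distances are the same in the projection as in the original space. Stepping through different triples of classes and interpolating between the resulting planes would give a sequence in the spirit of a class tour.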
