AutoPaint: a toolkit for visualizing data in four or more dimensions

This dissertation describes a collection of computational and graphical tools for finding and displaying structure in high-dimensional point clouds. The toolkit is organized around the two basic principles of focusing and linking (McDonald et al., 1990). Focusing tools let the analyst browse the data and home in on aspects of interest. Linking tools connect multiple focused views and allow the information in them to be integrated into a coherent image of the data as a whole. In this dissertation I develop automatic and assisted methods that use color to link information across focused views. To assign observations to disjoint groups that can then be colored, I propose algorithms for detecting cluster structure using single linkage hierarchical clustering and minimal spanning tree methods. I propose a sharpening method that thins the data and highlights high-density regions, making cluster cores easier to identify when clusters are present. I also develop automatic focusing methods for selecting low-dimensional projections that show chosen data subgroups as well separated as possible from the rest of the data, to assist in understanding the relative positions of subgroups in the data space.
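
As a rough illustration of how a minimal spanning tree can yield disjoint groups for coloring, the sketch below uses the textbook equivalence between single linkage clustering and the minimal spanning tree: cutting the k - 1 longest tree edges leaves the k single linkage clusters. It is not the dissertation's algorithm, which this abstract does not specify. The function names (minimal_spanning_tree, single_linkage_groups) and the toy data are assumptions made for illustration only.

```python
import math


def minimal_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (length, i, j) edges. Illustrative helper, O(n^2).
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n  # shortest known distance from the tree to each point
    best_from = [-1] * n        # tree vertex that realizes that distance
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Add the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_dist[u], best_from[u], u))
        # Update distances from the enlarged tree to the remaining points.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


def single_linkage_groups(points, n_groups):
    """Cut the n_groups - 1 longest tree edges and label the resulting components."""
    n = len(points)
    edges = sorted(minimal_spanning_tree(points))
    kept = edges[: max(0, len(edges) - (n_groups - 1))]

    parent = list(range(n))  # union-find forest over the points

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for _, i, j in kept:
        parent[find(i)] = find(j)

    # Number the components in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


# Toy four-dimensional data: two tight groups far apart (made up for this sketch).
points = [
    (0.0, 0.0, 0.0, 0.0),
    (0.1, 0.0, 0.0, 0.0),
    (0.0, 0.1, 0.0, 0.0),
    (5.0, 5.0, 5.0, 5.0),
    (5.1, 5.0, 5.0, 5.0),
]
print(single_linkage_groups(points, n_groups=2))  # [0, 0, 0, 1, 1]
```

Each integer label could then be mapped to a color, so that an observation keeps the same color in every linked view. Choosing the number of groups in advance is a simplification made here; automatic detection of cluster structure would need a rule for deciding which edges to cut.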