Generalization performance of factor analysis techniques used for image database organization

The goal of this paper is to evaluate the generalization performance of a variety of factor analysis techniques in an image database environment. Factor analysis techniques, such as Principal Components Analysis, have been proposed as a means of reducing the dimensionality of the data stored in image retrieval systems. These techniques compute a transformation that is applied to vectors of image features to produce vectors of lower dimensionality that still characterize the original data well. Computing such transformations for very large numbers of images is computationally expensive, especially if the calculation must be repeated each time new images are added to the database. It is therefore desirable that a transformation computed from a subset of all possible images perform well when applied to images not used in its derivation. To evaluate this generalization ability, we measure the agreement between partitionings of image sets computed using such transformations and those produced by human subjects.
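
The evaluation idea can be illustrated with a minimal sketch, under several assumptions that do not come from the paper: the feature vectors are toy two-dimensional data, only the first principal component is kept (found here by power iteration), the machine partitioning is a simple sign threshold on the projected coordinate, and agreement is scored with the Rand index. The function names `fit_top_component`, `project`, and `rand_index` are hypothetical, and the measure and techniques actually used in the paper may differ.

```python
import math
from itertools import combinations


def fit_top_component(train):
    """Estimate the mean and first principal direction from a training subset."""
    n, d = len(train), len(train[0])
    mean = [sum(row[j] for row in train) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in train]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication by cov converges to the
    # dominant eigenvector, provided the start vector is not orthogonal to it.
    v = [1.0] * d
    for _ in range(200):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def project(vectors, mean, direction):
    """Apply the already-fitted transformation to (possibly unseen) vectors."""
    return [sum((x[j] - mean[j]) * direction[j] for j in range(len(mean)))
            for x in vectors]


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two partitionings agree
    (both place the pair together, or both place it apart)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)


# Toy data (assumed): the transformation is derived from `train` only.
train = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9]]
held_out = [[0.5, 0.4], [1.5, 1.6], [3.5, 3.4], [4.5, 4.6]]
human_labels = [0, 0, 1, 1]  # hypothetical grouping by a human subject

mean, direction = fit_top_component(train)
coords = project(held_out, mean, direction)
machine_labels = [int(c > 0) for c in coords]  # toy two-way partitioning
agreement = rand_index(machine_labels, human_labels)
```

The transformation is never refitted on `held_out`, so the `agreement` score reflects how well a transformation derived from one subset carries over to images outside it.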