A pixel-based approach to view-based object recognition with self-organizing neural networks

This paper addresses the pixel-based classification of three-dimensional objects from arbitrary views. To perform this task, a coding strategy for pixel data, inspired by the biological model of human vision, is described. The coding ensures that the input data are invariant to shift, scale, and rotation of the object in the input domain. The coded image data serve as input to a class of self-organizing neural networks, the Kohonen maps or self-organizing feature maps (SOFMs). To verify this approach, two test sets were generated: the first, consisting of artificially generated images, is used to examine the classification properties of the SOFM; the second examines the clustering capabilities of the SOFM when real-world image data are applied to the network after being preprocessed for invariance to shift, scale, and rotation. It is shown that the clustering capability of the SOFM depends strongly on the invariance coding of the images.
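The following is a minimal sketch of the pipeline outlined above, under two explicit assumptions that are not specified in the abstract: the invariance coding is approximated here by a Fourier-magnitude plus log-polar resampling scheme (a common way to obtain shift, scale, and rotation invariance), and the SOFM uses a standard Gaussian-neighborhood Kohonen update. All names, map sizes, and learning parameters are illustrative, not the paper's actual configuration.

```python
# Hedged sketch: assumed invariance coding (Fourier magnitude + log-polar) feeding
# a small self-organizing feature map (SOFM). Parameters are illustrative only.
import numpy as np

def invariant_coding(image, out_size=(32, 32)):
    """Shift invariance via the Fourier magnitude; an assumed log-polar resampling
    turns scale and rotation into translations, which a second Fourier magnitude
    removes. This is a stand-in for the paper's biologically inspired coding."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(image)))   # shift-invariant
    h, w = spectrum.shape
    cy, cx = h / 2.0, w / 2.0
    rows, cols = out_size
    log_r = np.exp(np.linspace(0.0, np.log(min(cy, cx)), rows))
    theta = np.linspace(0.0, 2.0 * np.pi, cols, endpoint=False)
    ys = np.clip((cy + np.outer(log_r, np.sin(theta))).astype(int), 0, h - 1)
    xs = np.clip((cx + np.outer(log_r, np.cos(theta))).astype(int), 0, w - 1)
    logpolar = spectrum[ys, xs]                               # nearest-neighbor sampling
    code = np.abs(np.fft.fft2(logpolar)).ravel()              # scale/rotation invariant
    return code / (np.linalg.norm(code) + 1e-12)

class SOFM:
    """Kohonen self-organizing feature map with a Gaussian neighborhood."""
    def __init__(self, rows, cols, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                         indexing="ij"), axis=-1).reshape(-1, 2)
        self.w = rng.normal(scale=0.1, size=(rows * cols, dim))

    def bmu(self, x):
        # index of the best-matching unit for input vector x
        return int(np.argmin(np.linalg.norm(self.w - x, axis=1)))

    def train(self, data, iters=2000, lr0=0.5, sigma0=3.0):
        rng = np.random.default_rng(1)
        for t in range(iters):
            x = data[rng.integers(len(data))]
            frac = t / iters
            lr = lr0 * (1.0 - frac)                           # decaying learning rate
            sigma = sigma0 * np.exp(-3.0 * frac)              # shrinking neighborhood
            d = np.linalg.norm(self.grid - self.grid[self.bmu(x)], axis=1)
            h = np.exp(-(d ** 2) / (2.0 * sigma ** 2))[:, None]
            self.w += lr * h * (x - self.w)

# Usage: cluster invariance-coded views and read off the winning map node per image.
images = [np.random.rand(64, 64) for _ in range(10)]          # stand-in for real views
codes = np.array([invariant_coding(img) for img in images])
som = SOFM(rows=6, cols=6, dim=codes.shape[1])
som.train(codes)
print([som.bmu(c) for c in codes])                            # cluster assignments
```

In this sketch the map node that wins for each coded view acts as its cluster label; comparing labels across views of the same object gives a rough picture of how strongly the clustering depends on the invariance coding, which is the effect the abstract reports.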