Translation-, Rotation-, Scale-, and Distortion-Invariant Object Recognition Through Self-Organization
The task of visual object recognition is often complicated by the fact that a single 3-D object can undergo a number of transformations that substantially alter its projection onto a 2-D surface, such as the retina. Such transformations include translation of the object in the visual field, changes in its size, its orientation in the 2-D plane, and the viewing perspective. For a general pattern recognition system to detect and recognize an object after such transformations, it must be able to associate widely differing patterns with the same object label. In this paper, a novel self-organizing model, called Multiple Elastic Modules (MEM), is presented that attempts to solve this problem by searching a multi-dimensional space in which each axis is defined by one of the transformations (e.g. scale, translation, rotation). A particular object of a specific size, orientation, and spatial location is mapped onto a single point in this space. Of course, distortions and minor variations in an object's image expand this point to a small localized area in the multi-dimensional space. Such a powerful representation scheme comes at the cost of high computational demand due to the combinatorially large search space. The MEM approach addresses this problem by efficiently partitioning the solution space and searching the most promising areas for the correct match. Simulation results are presented on detecting a stick-figure object under translation, distortion, scale, and rotation transformations in a cluttered background.
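To make the idea of searching a transformation space concrete, the sketch below (not the authors' MEM implementation) performs a coarse-to-fine search over points (tx, ty, scale, rotation): at each level only the best-scoring candidates are kept and refined, loosely mirroring the paper's strategy of partitioning the space and concentrating effort on the most promising regions. The match score, grid resolutions, and beam width are all assumptions made for illustration.

```python
import numpy as np

def transform_points(points, tx, ty, scale, theta):
    """Apply scale, rotation, and translation to a set of 2-D model points."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return scale * points @ R.T + np.array([tx, ty])

def match_score(image, points):
    """Sum image intensity at the transformed model points (higher is better)."""
    h, w = image.shape
    ij = np.round(points).astype(int)
    inside = (ij[:, 0] >= 0) & (ij[:, 0] < w) & (ij[:, 1] >= 0) & (ij[:, 1] < h)
    return image[ij[inside, 1], ij[inside, 0]].sum()

def coarse_to_fine_search(image, model, levels=3, beam=5):
    """Keep only the best-scoring regions of (tx, ty, scale, theta) at each level."""
    h, w = image.shape
    # Coarse grid over the full transformation space (hypothetical resolutions).
    candidates = [(tx, ty, s, th)
                  for tx in np.linspace(0, w, 8)
                  for ty in np.linspace(0, h, 8)
                  for s in (0.5, 1.0, 1.5)
                  for th in np.linspace(0, 2 * np.pi, 8, endpoint=False)]
    steps = (w / 8, h / 8, 0.5, 2 * np.pi / 8)
    for _ in range(levels):
        # Keep the `beam` most promising candidates, then refine around each.
        scored = sorted(candidates,
                        key=lambda p: match_score(image, transform_points(model, *p)),
                        reverse=True)[:beam]
        steps = tuple(s / 2 for s in steps)
        candidates = [(tx + dx, ty + dy, s + ds, th + dth)
                      for (tx, ty, s, th) in scored
                      for dx in (-steps[0], 0, steps[0])
                      for dy in (-steps[1], 0, steps[1])
                      for ds in (-steps[2], 0, steps[2])
                      for dth in (-steps[3], 0, steps[3])]
    return max(candidates,
               key=lambda p: match_score(image, transform_points(model, *p)))
```

A stick-figure model would be supplied as an (N, 2) array of line-sample points and the image as a 2-D intensity array; the returned tuple is the estimated (translation, scale, rotation) of the best match under this assumed scoring scheme.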