Object localization and recognition for a grasping robot

This paper presents a vision system for object localization and recognition in a grasping-robot environment. The authors' approach to object localization is based on the sequential integration of early vision processes, such as color and edge detection. No assumptions about the object or the background are required: the process detects blobs of interest in the scene and treats them as object candidates, which makes the localization stage reliable, flexible and robust. Once localization completes, the next task is to classify the object in terms of its shape, orientation, size and exact position, as a basis for grasping.

Recognition is achieved by comparing the image to stored two-dimensional object views. Stored views are represented as labelled graphs derived automatically from images of object models: graph nodes are labelled by edge information, and graph links by distance vectors in the image plane. The graphs emphasize occluding boundaries and inner object edges, which are identified by extracting local maxima of the Mallat wavelet transform of the image. Stored graphs are compared to test images by elastic matching. The system is robust with respect to surface markings and cluttered backgrounds, and the authors' experiments demonstrate fairly reliable object recognition and pose estimation in natural scenes.
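To make the localization idea concrete, the following sketch combines a color cue and an edge cue and treats the resulting connected blobs as object candidates. It is a minimal approximation of the sequential-integration scheme described above, not the authors' implementation: the OpenCV pipeline, the saturation-based color cue, and the thresholds are all assumptions.

```python
import cv2
import numpy as np

def find_object_candidates(image_bgr, min_area=500):
    """Return bounding boxes (x, y, w, h) of blobs supported by color and edge cues."""
    # Color cue: pixels whose saturation stands out from an (assumed dull) background.
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    _, color_mask = cv2.threshold(hsv[:, :, 1], 0, 255,
                                  cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Edge cue: Canny edges, dilated so nearby edges merge into regions.
    edges = cv2.Canny(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY), 50, 150)
    edge_mask = cv2.dilate(edges, np.ones((7, 7), np.uint8))

    # Sequential integration of the two cues: keep regions supported by both.
    support = cv2.bitwise_and(color_mask, edge_mask)

    # Connected blobs above a size threshold become object candidates.
    n, _, stats, _ = cv2.connectedComponentsWithStats(support)
    return [tuple(stats[i, :4]) for i in range(1, n)
            if stats[i, cv2.CC_STAT_AREA] >= min_area]
```

The recognition stage can likewise be pictured with a toy version of the graph representation and elastic matching described above: nodes carry local edge responses, links carry image-plane distance vectors, and matching trades node similarity off against geometric distortion. The gradient-magnitude label (standing in for the Mallat wavelet modulus maxima) and the greedy per-node search are simplifying assumptions.

```python
import numpy as np

def edge_map(gray):
    # Gradient magnitude as a crude stand-in for wavelet modulus maxima.
    gy, gx = np.gradient(gray.astype(float))
    return np.hypot(gx, gy)

def build_graph(gray, nodes):
    """nodes: list of (row, col). Node labels = local edge strength,
    links = distance vectors from the first node."""
    e = edge_map(gray)
    labels = np.array([e[r, c] for r, c in nodes])
    links = np.array(nodes, float) - np.array(nodes, float)[0]
    return labels, links

def elastic_match_cost(model_labels, model_links, image_gray, anchor, lam=0.1):
    """Place the model graph at `anchor` and let each node shift a few pixels;
    return the total cost of label mismatch plus lambda-weighted distortion."""
    e = edge_map(image_gray)
    h, w = e.shape
    total = 0.0
    for lbl, off in zip(model_labels, model_links):
        base = np.array(anchor, float) + off
        best = np.inf
        for dr in range(-2, 3):          # elastic deformation of the node position
            for dc in range(-2, 3):
                r, c = int(base[0]) + dr, int(base[1]) + dc
                if 0 <= r < h and 0 <= c < w:
                    cost = abs(e[r, c] - lbl) + lam * (dr * dr + dc * dc)
                    best = min(best, cost)
        total += best
    return total
```

In this sketch, lower matching cost over the candidate blobs would indicate the best-fitting stored view; the distortion penalty is what makes the matching "elastic" rather than a rigid template comparison.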