Autonomous systems that learn and use a limited visual vocabulary have widespread applications. Enabling such systems to segment a set of cluttered scenes into objects is a challenging vision problem, owing to the non-homogeneous texture of objects and the arbitrary configurations in which multiple objects appear in each scene. We address the following question: given a collection of images in which each object appears in one or more images and multiple objects occur in each image, how can we best extract the boundaries of the individual objects? The algorithm takes as input a set of stereo images, one stereo pair per scene. The novelty of our work lies in using both color/texture and structure to refine previously determined object boundaries, yielding a segmentation consistent with every input scene. The algorithm populates an object library consisting of one 3D model per object. Since an object is characterized by both texture and structure, this representation is, for most purposes, both complete and concise.
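To make the per-scene pipeline concrete, the sketch below shows one plausible realization, assuming OpenCV's semi-global block matching for stereo disparity and a graph-based (Felzenszwalb) over-segmentation as the color/texture stage; the function names, parameter values, and the median-split refinement rule are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the pipeline described above: per scene, compute stereo
# disparity, over-segment the reference image by color/texture, and use structure
# (disparity) to refine region boundaries.  Library choices and thresholds are
# assumptions for illustration only.

import cv2
import numpy as np
from skimage.segmentation import felzenszwalb

def disparity_map(left_gray, right_gray, num_disp=64, block_size=7):
    """Dense disparity for a rectified stereo pair via semi-global matching."""
    matcher = cv2.StereoSGBM_create(minDisparity=0,
                                    numDisparities=num_disp,
                                    blockSize=block_size)
    # SGBM returns fixed-point disparities scaled by 16.
    return matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0

def color_texture_segments(left_rgb):
    """Initial over-segmentation driven by color/texture homogeneity."""
    return felzenszwalb(left_rgb, scale=200, sigma=0.8, min_size=100)

def refine_with_structure(labels, disparity, depth_jump=4.0):
    """Split color regions whose disparity range indicates two distinct surfaces."""
    refined = labels.copy()
    next_label = labels.max() + 1
    for region_id in np.unique(labels):
        mask = labels == region_id
        valid = disparity[mask]
        valid = valid[valid > 0]
        if valid.size and (valid.max() - valid.min()) > depth_jump:
            # The color segment straddles a depth discontinuity: pixels whose
            # disparity falls below the region median (farther surfaces) are
            # split off into a new region.
            far = mask & (disparity > 0) & (disparity < np.median(valid))
            refined[far] = next_label
            next_label += 1
    return refined

def segment_scene(left_bgr, right_bgr):
    """Color/texture segmentation refined by stereo structure for one scene."""
    left_gray = cv2.cvtColor(left_bgr, cv2.COLOR_BGR2GRAY)
    right_gray = cv2.cvtColor(right_bgr, cv2.COLOR_BGR2GRAY)
    disp = disparity_map(left_gray, right_gray)
    labels = color_texture_segments(cv2.cvtColor(left_bgr, cv2.COLOR_BGR2RGB))
    return refine_with_structure(labels, disp)
```

The refinement step encodes one simple way structure can correct a purely appearance-based boundary: a region whose disparity range exceeds a threshold is assumed to span a depth discontinuity between two objects and is split accordingly.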