Polyhedral Object Localization in an Image by Referencing to a Single Model View

Identifying a three-dimensional (3D) object in an image has traditionally been approached by reference to a 3D model of the object. In the last few years there has been growing interest in using multiple views of the object, rather than a 3D shape, as the reference. This paper attempts a further step in that direction, using a single clean view rather than multiple views as the reference model. The key issue is how to establish correspondences from the model view, where the boundary of the object is explicitly available, to the scene view, where the object may be surrounded by distracting entities and its boundary disturbed by noise. We propose a solution based on a mechanism that predicts correspondences from just four particular initial point correspondences. The object is required to be polyhedral or near-polyhedral. The computational complexity of the correspondence mechanism is linear in the total number of visible corners of the object in the model view. The limitations of the mechanism are also analyzed thoroughly in this paper. Experimental results on real images are presented to illustrate the performance of the proposed solution.
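
The abstract does not spell out how four initial correspondences constrain the remaining ones. The sketch below shows one well-known way this can work. It is an illustration under assumptions, not the mechanism proposed in the paper. It assumes both views are formed by an affine (weak-perspective) camera, so that every correspondence satisfies a single linear constraint a·x' + b·y' + c·x + d·y + e = 0, whose coefficients can be estimated from four correspondences in general position. Each remaining model corner then maps to a line in the scene view on which its match must lie, at constant cost per corner. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def affine_epipolar_from_four(model_pts, scene_pts):
    """Estimate (a, b, c, d, e) with a*x' + b*y' + c*x + d*y + e = 0.

    model_pts holds (x, y) and scene_pts holds (x', y'), one row per
    correspondence. Four correspondences in general position fix the
    coefficients up to scale (assumption: affine cameras in both views).
    """
    model_pts = np.asarray(model_pts, dtype=float)
    scene_pts = np.asarray(scene_pts, dtype=float)
    ones = np.ones((len(model_pts), 1))
    A = np.hstack([scene_pts, model_pts, ones])  # one row per correspondence
    _, _, vt = np.linalg.svd(A)
    return vt[-1]  # right singular vector for the smallest singular value

def predicted_line(f, model_pt):
    """Line (a, b, k) in the scene view, a*x' + b*y' + k = 0, for one model corner."""
    a, b, c, d, e = f
    x, y = model_pt
    return a, b, c * x + d * y + e

def distance_to_line(line, pt):
    """Perpendicular distance from a scene point to a predicted line."""
    a, b, k = line
    return abs(a * pt[0] + b * pt[1] + k) / np.hypot(a, b)

# Synthetic check: random 3D corners seen by two random affine cameras.
rng = np.random.default_rng(0)
corners_3d = rng.normal(size=(8, 3))
cam_model, cam_scene = rng.normal(size=(2, 3)), rng.normal(size=(2, 3))
model_view = corners_3d @ cam_model.T + rng.normal(size=2)
scene_view = corners_3d @ cam_scene.T + rng.normal(size=2)

# Four initial correspondences, then one pass over the remaining corners.
f = affine_epipolar_from_four(model_view[:4], scene_view[:4])
residuals = [
    distance_to_line(predicted_line(f, m), s)
    for m, s in zip(model_view[4:], scene_view[4:])
]
```

With noise-free synthetic data the residuals are zero up to rounding. In a real scene view, candidate corners would instead be ranked by their distance to each predicted line. Note that under this assumption a model corner is constrained only to a line, not to a point, so further cues would be needed to single out the match.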
