Stratification of three-dimensional vision: projective, affine, and metric representations

A conceptual framework is provided for thinking about the relationships between the three-dimensional structure of physical space and the geometric properties of a set of cameras that provide the images from which measurements are made. Physical space is usually thought of as embedded in a three-dimensional Euclidean space, in which measurements of lengths and angles make sense. For artificial systems such as robots, this viewpoint is not mandatory: it is sometimes sufficient to think of physical space as embedded in an affine or even a projective space. The question then arises of how to relate these models to image measurements and to the geometric properties of sets of cameras. It is shown that, for two cameras (a stereo rig), the projective structure of the world can be recovered as soon as the epipolar geometry of the rig is known, and that this geometry is summarized by a single 3 × 3 matrix, called the fundamental matrix. The affine structure can then be recovered if this information is supplemented by the projective transformation between the two images that is induced by the plane at infinity. Finally, the Euclidean structure (up to a similarity transformation) can be recovered if these two elements are supplemented by two conics (one for each camera) that are the images of the absolute conic, a circle of radius √−1 in the plane at infinity. In all three cases it is shown how the three-dimensional information can be recovered directly from the images without explicit reconstruction of the scene structure. This defines a natural hierarchy of geometric structures, a set of three strata overlaid upon the physical world, which is shown to be recoverable by simple procedures that rely on two items: the physical space itself, together with possibly, but not necessarily, some a priori information about it, and some voluntary motions of the set of cameras.
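
The role of the three objects named in the abstract (the fundamental matrix, the homography induced by the plane at infinity, and the image of the absolute conic) can be illustrated with a minimal numerical sketch. It is not the paper's procedure: it assumes a synthetic pinhole stereo rig whose intrinsic matrices `A1`, `A2`, rotation `R`, and translation `t` are invented for illustration, and it builds the three objects from that known calibration using the standard pinhole-model expressions F = A2^(-T) [t]x R A1^(-1), H_inf = A2 R A1^(-1), and omega = A1^(-T) A1^(-1). In the uncalibrated setting the abstract addresses, these objects would instead be estimated from image measurements. All helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(a, v):
    """Matrix-vector product."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in a]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def transpose(a):
    return [list(col) for col in zip(*a)]

def inv3(m):
    """Inverse of a 3 x 3 matrix via its adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]

def skew(t):
    """Matrix [t]x such that [t]x v equals the cross product t x v."""
    return [[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]]

# Hypothetical rig (assumed values): a point X in the first camera frame
# has coordinates R X + t in the second camera frame.
c, s = math.cos(0.1), math.sin(0.1)
A1 = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
A2 = [[750.0, 0.0, 300.0], [0.0, 760.0, 250.0], [0.0, 0.0, 1.0]]
R = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
t = [-0.2, 0.01, 0.02]

A1_inv = inv3(A1)
F = matmul(matmul(transpose(inv3(A2)), skew(t)), matmul(R, A1_inv))
H_inf = matmul(matmul(A2, R), A1_inv)
omega = matmul(transpose(A1_inv), A1_inv)

# Projective stratum: corresponding image points satisfy m2^T F m1 = 0.
# Homogeneous coordinates are used, so no division by depth is needed.
X = [0.3, -0.1, 4.0]
m1 = matvec(A1, X)
m2 = matvec(A2, [p + q for p, q in zip(matvec(R, X), t)])
epipolar_residual = dot(m2, matvec(F, m1))

# Affine stratum: H_inf transfers the image of a point at infinity
# (a direction d1) from the first image to the second.
d1 = [1.0, 0.0, 1.0]
v2_transferred = matvec(H_inf, matvec(A1, d1))
v2_expected = matvec(A2, matvec(R, d1))

# Metric stratum: omega gives the angle between the optical rays through
# two pixels of the first image, here the images of directions d1 and d2.
d2 = [0.0, 0.0, 1.0]
p, q = matvec(A1, d1), matvec(A1, d2)
cos_angle = dot(p, matvec(omega, q)) / math.sqrt(
    dot(p, matvec(omega, p)) * dot(q, matvec(omega, q)))
angle = math.acos(cos_angle)
```

In this sketch the epipolar residual vanishes up to rounding, the transferred vanishing point coincides with the directly projected one, and the recovered angle equals the angle between `d1` and `d2`; each check uses one more object than the previous one, mirroring the projective, affine, and metric strata.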
