Abstract

This paper presents techniques for using objects in a scene to define the reference frames for 3D reconstruction. We first present a simple technique to calibrate an orthographic projection from four non-coplanar reference points. We then show how the observation of two additional known scene points can provide the complete perspective projection. When used with a known object, this technique permits calibration of the full projective transformation matrix without matrix inversion. For an arbitrary non-coplanar set of four (or six) points, this calibration provides an affine basis for the reconstruction of local scene structure. When the four points define three orthogonal vectors, the basis is orthogonal, with a metric defined by the lengths of the three vectors. We demonstrate this technique for the case of a cube. We present results in which five and a half points on the cube are sufficient to compute the projective transformation for an orthogonal basis by direct observation (without matrix inversion). We then present experiments with three techniques for reducing the imprecision caused by errors in the positions of the reference points due to pixel quantization and noise. We provide experimental measurements of the stability of the stereo reconstruction as a function of the error in the observed pixel positions of the reference points used for calibration.

A major problem in active 3D vision is updating a calibration matrix as the camera focus, aperture, zoom or vergence angle is changed. We present a technique for correcting the projective transformation matrix by tracking reference points. Our experiments show that a change of focus can be corrected by an affine transform obtained by tracking three points. For a camera rotation near the principal point, such as with stereo vergence, the correction is slightly more precise with a projective correction matrix obtained by tracking at least four points.
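As an illustration of the four-point calibration by direct observation mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes an affine (orthographic-like) camera model and assumes that the four reference points are the origin of the object frame and the end points of its three basis vectors, with scene coordinates (0,0,0), (1,0,0), (0,1,0) and (0,0,1). Under those assumptions, the columns of the projection matrix can be read off as differences of observed image positions, with no matrix inversion. The function names `affine_projection_from_basis` and `project` are hypothetical.

```python
def affine_projection_from_basis(p0, p1, p2, p3):
    """Build a 2x4 affine projection matrix from four observed image points.

    Assumption: p0 is the image of the frame origin, and p1, p2, p3 are the
    images of the end points of the three basis vectors, i.e. scene points
    (1,0,0), (0,1,0) and (0,0,1) expressed in the object's own frame.
    """
    # Each basis vector projects to the image displacement from the origin.
    cols = [(p[0] - p0[0], p[1] - p0[1]) for p in (p1, p2, p3)]
    return [
        [cols[0][0], cols[1][0], cols[2][0], p0[0]],
        [cols[0][1], cols[1][1], cols[2][1], p0[1]],
    ]


def project(M, X):
    """Project a scene point X = (x, y, z), given in the object frame."""
    x, y, z = X
    return tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in M)


# Hypothetical pixel positions of one cube corner and its three neighbours.
M = affine_projection_from_basis((100, 200), (150, 210), (90, 150), (120, 230))
corner = project(M, (1, 1, 0))  # predicted image of another cube corner
```

Because the basis vectors need not be orthogonal or of unit length, the same construction yields an affine basis for any four non-coplanar points; the metric interpretation only holds when the three vectors are orthogonal, as with the cube.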
We then show how stereo reconstruction permits us to ‘hop’ the reference frame from the known calibration object to a new reference frame defined by reconstruction of four non-coplanar points. Any set of four non-coplanar points in the scene may define such a reference frame. We also show how to keep the reference frame locked onto a set of four points as the stereo head is translated or rotated. These techniques make it possible to reconstruct the shape of an object in its intrinsic coordinates without having to match new observations to a partially reconstructed description.
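One way to picture the frame 'hop' described above is to re-express reconstructed points in the affine coordinates of a frame defined by four non-coplanar reconstructed points. The sketch below is an assumption about how such a change of frame could be written, not the method of the paper. It solves the 3x3 linear system with Cramer's rule, which is a choice made here for brevity; the function names `det3` and `affine_coordinates` are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def affine_coordinates(x, origin, e1, e2, e3):
    """Coordinates (a, b, c) such that
    x = origin + a*(e1 - origin) + b*(e2 - origin) + c*(e3 - origin).

    All arguments are 3D points expressed in the old reference frame.
    """
    # Columns of the basis matrix are the three edge vectors of the new frame.
    basis = [[e[i] - origin[i] for e in (e1, e2, e3)] for i in range(3)]
    rhs = [x[i] - origin[i] for i in range(3)]
    d = det3(basis)
    if abs(d) < 1e-12:
        raise ValueError("reference points are coplanar")
    coords = []
    for k in range(3):
        # Cramer's rule: replace column k with the right-hand side.
        bk = [row[:] for row in basis]
        for i in range(3):
            bk[i][k] = rhs[i]
        coords.append(det3(bk) / d)
    return tuple(coords)


# Hypothetical reconstructed points defining the new frame, and a point to re-express.
new_coords = affine_coordinates((2, 3, 3), (1, 1, 1), (3, 1, 1), (1, 2, 1), (1, 1, 5))
```

The coplanarity check reflects the requirement stated in the text that the four points be non-coplanar; otherwise the three edge vectors do not span 3D space and the coordinates are undefined.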