Direct-motion stereo

In this paper, we show how the translational motion of a stereo vision system relative to the scene, and its distance from the scene, can be recovered in closed form directly from measurements of image gradients and time derivatives. There is no need to estimate image motion or to establish correspondences between features across images. The direction of translational motion is recovered by minimizing the sum of squared errors of a linear constraint equation over the image. The solution is the eigenvector corresponding to the smallest eigenvalue of a 3 × 3 positive semi-definite matrix. Using the average disparity, which maximizes the cross-correlation between the left and right images, we estimate the scale factor needed to compute the magnitude of the translational motion and, consequently, the distance to the scene.
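
The two numerical steps named above can be illustrated with a minimal sketch. It assumes that each pixel contributes a coefficient vector a_i to a homogeneous linear constraint a_i · t = 0 on the translation direction t; the abstract does not spell out how these coefficients are formed from the image gradients and time derivatives, so the array `A` below is a hypothetical stand-in. The function names, the integer shift search, and the use of normalized cross-correlation are likewise assumptions made for illustration, not the paper's implementation.

```python
import numpy as np


def translation_direction(A):
    """Unit vector t minimizing sum_i (a_i . t)^2 subject to |t| = 1.

    A is an (N, 3) array whose rows are the per-pixel constraint
    coefficients a_i (hypothetical; how they are built is not shown here).
    """
    A = np.asarray(A, dtype=float)
    M = A.T @ A  # 3 x 3, symmetric positive semi-definite
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(M)
    t = eigvecs[:, 0]  # eigenvector of the smallest eigenvalue
    # The sign of t is not determined by this minimization alone.
    return t, eigvals


def average_disparity(left, right, max_shift):
    """Integer horizontal shift d maximizing the normalized
    cross-correlation, under the convention right[:, x] ~ left[:, x + d].
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    width = left.shape[1]
    best_d, best_score = 0, -np.inf
    for d in range(-max_shift, max_shift + 1):
        # Keep only the columns where the two shifted images overlap.
        if d >= 0:
            l_part, r_part = left[:, d:], right[:, :width - d]
        else:
            l_part, r_part = left[:, :width + d], right[:, -d:]
        l_zero = l_part - l_part.mean()
        r_zero = r_part - r_part.mean()
        denom = np.sqrt((l_zero ** 2).sum() * (r_zero ** 2).sum())
        if denom == 0.0:
            continue  # flat overlap region, correlation undefined
        score = (l_zero * r_zero).sum() / denom
        if score > best_score:
            best_d, best_score = d, score
    return best_d
```

In this sketch the direction estimate is defined only up to sign, and the disparity is a single whole-pixel shift for the entire image; how the resulting scale factor is combined with the direction to obtain the motion magnitude and scene distance is not shown.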