Shape and motion from image streams under orthography: a factorization method

Inferring scene geometry and camera motion from a stream of images is possible in principle, but is an ill-conditioned problem when the objects are distant with respect to their size. We have developed a factorization method that can overcome this difficulty by recovering shape and motion under orthography without computing depth as an intermediate step.

An image stream can be represented by the 2F×P measurement matrix of the image coordinates of P points tracked through F frames. We show that under orthographic projection this matrix is of rank 3.

Based on this observation, the factorization method uses the singular-value decomposition technique to factor the measurement matrix into two matrices which represent object shape and camera rotation respectively. Two of the three translation components are computed in a preprocessing stage. The method can also handle and obtain a full solution from a partially filled-in measurement matrix that may result from occlusions or tracking failures.

The method gives accurate results, and does not introduce smoothing in either shape or motion. We demonstrate this with a series of experiments on laboratory and outdoor image streams, with and without occlusions.
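The rank-3 property and the SVD-based factoring described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `factorize` and `synthetic_stream`, the row layout (all x-coordinate rows followed by all y-coordinate rows), the even split of singular values between the two factors, and the synthetic noise-free data are assumptions made for illustration. The sketch stops at the rank-3 factors, which are determined only up to an invertible 3×3 linear transformation; resolving that ambiguity and handling a partially filled-in matrix are not attempted here.

```python
import numpy as np


def factorize(W):
    """Factor a 2F x P measurement matrix into rank-3 motion and shape factors.

    Assumed layout: rows 0..F-1 hold x-coordinates, rows F..2F-1 hold
    y-coordinates. The factors are recovered only up to an invertible
    3 x 3 linear transformation.
    """
    # Preprocessing: the mean of each row gives the two image-plane
    # translation components for that frame; subtracting it registers W.
    t = W.mean(axis=1, keepdims=True)
    W_reg = W - t

    # Singular-value decomposition of the registered matrix.
    U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)

    # Keep the three largest singular values and split them evenly
    # between the two factors (an illustrative choice).
    root = np.sqrt(s[:3])
    M = U[:, :3] * root           # 2F x 3 motion factor
    S = root[:, None] * Vt[:3]    # 3 x P shape factor
    return M, S, t, s


def synthetic_stream(F=20, P=30, seed=0):
    """Noise-free orthographic projections of P random points over F frames."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((3, P))  # hypothetical 3-D points
    rows_x, rows_y = [], []
    for _ in range(F):
        # Random orthogonal matrix; its first two rows act as the
        # image-plane axes of the camera in this frame.
        Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
        tx, ty = rng.standard_normal(2)
        rows_x.append(Q[0] @ X + tx)
        rows_y.append(Q[1] @ X + ty)
    return np.vstack(rows_x + rows_y)


W = synthetic_stream()
M, S, t, s = factorize(W)

# With noise-free data the fourth singular value is numerically zero,
# and the rank-3 product plus the translation reproduces W.
print("relative 4th singular value:", s[3] / s[0])
print("reconstruction ok:", np.allclose(M @ S + t, W))
```

With real tracked features the fourth and later singular values would not vanish, and truncating to the first three then acts as a least-squares rank-3 approximation of the registered matrix.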
