Affine structure from motion.

A mobile observer samples sequences of narrow-field projections of configurations in ambient space. The structure-from-motion problem is to infer the structure of these spatial configurations from the sequence of projections. For rigid transformations, a unique metrical reconstruction is known to be possible from three orthographic views of four points. However, human observers seem able to obtain much shape information from a mere pair of views, as is evident in the case of binocular stereo. Moreover, human observers seem to find little use for the information provided by additional views, even though some improvement certainly occurs. The rigidity requirement, in its strict form, can also be relaxed. We indicate how solutions of the structure-from-motion problem can be stratified in such a way that one explicitly knows at which stages various a priori assumptions enter and where specific geometrical expertise is required. An affine stage is identified at which only smooth deformation is assumed (thus no rigidity constraint is involved) and no metrical concepts are required. This stage allows one to find the spatial configuration, modulo an affinity, from two views. The addition of metrical methods allows one to find shape from two views, modulo a relief transformation (depth scaling and shear). The addition of a third view then merely serves to settle the calibration. Results of a numerical experiment are discussed.
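
The affine stage can be illustrated with a minimal sketch. It assumes parallel (affine) projection, known point correspondences between the two views, and four non-coplanar reference points. It is not the construction used in the paper: it swaps in a plain least-squares solve. The names `affine_coords`, `project`, `cam1`, `cam2` and the sample scene are hypothetical, chosen only for illustration. Every point is expressed as P = O + a(A - O) + b(B - O) + c(C - O). Because parallel projection is affine, the same (a, b, c) hold in each image, so two views give four linear equations for three unknowns. No angles or lengths are used, and the second view may show an affinely deformed copy of the scene.

```python
def sub(p, q):
    """Componentwise difference p - q."""
    return [a - b for a, b in zip(p, q)]

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, r):
    """Solve the 3x3 system m x = r by Cramer's rule."""
    d = det3(m)
    return [det3([[r[i] if j == c else m[i][j] for j in range(3)]
                  for i in range(3)]) / d
            for c in range(3)]

def affine_coords(view1, view2, basis=(0, 1, 2, 3)):
    """Affine coordinates (a, b, c) of every point with respect to the
    frame spanned by the four reference points whose indices are in `basis`.
    view1, view2: lists of (x, y) image points in correspondence."""
    o, i, j, k = basis
    # Each column stacks the image displacement of one frame vector
    # in both views, giving a 4-vector.
    cols = [sub(view1[n], view1[o]) + sub(view2[n], view2[o])
            for n in (i, j, k)]
    # Normal equations of the 4x3 least-squares problem.
    gram = [[sum(x * y for x, y in zip(cols[r], cols[c])) for c in range(3)]
            for r in range(3)]
    result = []
    for p1, p2 in zip(view1, view2):
        d = sub(p1, view1[o]) + sub(p2, view2[o])
        rhs = [sum(x * y for x, y in zip(cols[r], d)) for r in range(3)]
        result.append(solve3(gram, rhs))
    return result

def project(points, rows, offset):
    """Parallel projection: a 2x3 linear map plus a 2D image offset."""
    return [tuple(sum(row[t] * p[t] for t in range(3)) + off
                  for row, off in zip(rows, offset))
            for p in points]

# Hypothetical scene: the first four points form the reference frame,
# chosen as the unit frame so the expected answer is the point itself.
scene = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0),
         (0.2, 0.5, 0.3), (1.0, -0.5, 2.0)]

# Two arbitrary affine cameras. The second is not a rigid motion of the
# first, which stands in for a non-rigid (affine) deformation between views.
cam1 = ([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)], (0.0, 0.0))
cam2 = ([(0.8, 0.1, 0.6), (-0.2, 1.1, 0.3)], (0.5, -1.0))

view1 = project(scene, *cam1)
view2 = project(scene, *cam2)
coords = affine_coords(view1, view2)
```

In this noise-free setup the recovered coordinates agree with the scene coordinates up to rounding. With a different choice of reference points the result would differ from the scene by an affinity, which is exactly the ambiguity left open at this stage. The one redundant equation per point is a consistency condition between the two views; with noisy data the least-squares solve averages over it.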