Tutorial: Computer vision—towards a three-dimensional world

Abstract This paper surveys theory and techniques relevant to the design of three-dimensional computer vision systems for intelligent robots and other applications. Unlike two-dimensional vision, 3-D vision uses information not only about the projected boundaries of objects, but also about the shapes of their surfaces and the ranges between the objects and the camera. The range information may be acquired either by direct measurement based on laser light or ultrasound reflectance, or by indirect computational approaches such as stereo vision or structured lighting. Calibration of the camera system and space coordinates is necessary if vision is to be used for the precise location of specific targets. Information about surface shape may be recovered from the variation of brightness in images; methods used for this purpose are often called ‘shape from shading’. Shape can also be recovered by detecting the deformation of contours or textures on object surfaces, or by computing the optical flow fields of moving objects. From time-varying image sequences, motion parameters (including rotation and translation in 3-D space) can be estimated under the assumption that objects are rigid; two approaches, feature-based and optical-flow-based, may be used for this task. Turning to practical applications, it is important that 3-D vision systems should be able to recognize automatically the objects viewed in a scene. For recognition, 3-D objects must first be modelled in terms of appropriate parameters or data structures, so that matching procedures can be performed between unknown and modelled objects. Appropriate techniques, and the principles on which they are based, are reviewed. As an example, a stereo computer vision system for recognition and location of polyhedral objects is illustrated.
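The stereo route to range information mentioned above rests on simple triangulation: for a rectified stereo pair, the depth of a point is inversely proportional to its disparity between the two images. The following is a minimal sketch of that relation, not the specific algorithm surveyed in the paper; the focal length, baseline, and disparity values are illustrative assumptions.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Triangulate depth (metres) for one point in a rectified stereo pair.

    Z = f * B / d, where f is the focal length in pixels, B the baseline
    (camera separation) in metres, and d the disparity in pixels.
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity;
        # negative disparity is a matching error.
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


# Illustrative numbers (assumed, not from the paper):
# f = 700 px, B = 0.12 m, d = 35 px  ->  Z = 700 * 0.12 / 35 = 2.4 m
z = depth_from_disparity(35.0, 700.0, 0.12)
```

Note the inverse relation: halving the disparity doubles the estimated depth, which is why stereo range accuracy degrades quickly for distant objects.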
