Depth and motion discontinuities

Depth and motion discontinuities arise wherever a light ray incident on a camera sensor meets a discrete change in the depth or motion of the surfaces in the world. Because these discontinuities tend to coincide with occlusions and object boundaries, they provide useful information for a number of computer vision applications, such as camera control, compression, and tracking. Moreover, because they have simple, precise definitions that depend only on the physics of the scene, they are unaffected by subjective considerations. The first part of this thesis presents an algorithm that detects depth discontinuities from a stereo pair of images by matching pixels in corresponding scanlines and then propagating information between those scanlines. It uses a new measure of pixel dissimilarity that is provably insensitive to image sampling. The algorithm is fast and is shown to produce good results on difficult images containing untextured, slanted surfaces. The thesis then describes work aimed at detecting motion discontinuities from a monocular image sequence, along with a discussion of why this is a harder problem. A preliminary algorithm tracks sparse features throughout a sequence, groups them according to an affine motion model, and traces the boundaries between the groups. Results show that the technique is promising but needs further work. Next, two recent maximum-flow-based stereo algorithms are compared, showing that they work well on textured, fronto-parallel surfaces but cannot handle untextured, slanted surfaces. One of the techniques is extended to solve stereo or motion correspondence with textured, slanted surfaces. Finally, an algorithm that uses knowledge of discontinuities to track a person's head is described. Modeling the head as an ellipse and concentrating on the intensity edges around the perimeter as well as the color of the interior, the system automatically controls a camera in real time to follow a person moving around an unmodified environment. Extensive experimentation shows the algorithm's robustness to full 360-degree out-of-plane rotation, severe but brief occlusion, arbitrary camera movement, and a dynamic background.
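
The abstract mentions a pixel dissimilarity measure that is insensitive to image sampling but does not define it. The sketch below illustrates one common formulation of such a measure for a pair of scanlines: each pixel is compared against the range of linearly interpolated intensities within half a pixel of the other, and the smaller of the two one-sided values is taken. The function name `bt_dissimilarity`, its helpers, and the clamping at scanline ends are assumptions made for illustration, not the thesis's own code.

```python
def bt_dissimilarity(left, right, xl, xr):
    """Illustrative sampling-insensitive dissimilarity between left[xl] and right[xr].

    `left` and `right` are scanlines given as sequences of intensities.
    All names here are hypothetical; this is a sketch, not the thesis code.
    """

    def half_sample_range(row, x):
        # Intensities linearly interpolated half a pixel to either side of x
        # (neighbours clamped at the scanline ends), plus the sample itself.
        centre = row[x]
        minus = 0.5 * (centre + row[max(x - 1, 0)])
        plus = 0.5 * (centre + row[min(x + 1, len(row) - 1)])
        return min(minus, plus, centre), max(minus, plus, centre)

    def one_sided(a, xa, b, xb):
        # Distance from a[xa] to the interpolated intensity range around b[xb];
        # zero when a[xa] falls inside that range.
        lo, hi = half_sample_range(b, xb)
        return max(0.0, a[xa] - hi, lo - a[xa])

    # Symmetrise by taking the smaller of the two one-sided distances.
    return min(one_sided(left, xl, right, xr), one_sided(right, xr, left, xl))


# The same intensity ramp sampled half a pixel apart in the two images.
left_row = [10, 20, 30, 40]
right_row = [15, 25, 35, 45]
print(abs(left_row[1] - right_row[1]))               # plain absolute difference
print(bt_dissimilarity(left_row, right_row, 1, 1))   # interpolation-aware value
```

In this example the two scanlines sample the same ramp at a half-pixel offset, so the plain absolute difference is nonzero while the interpolation-aware value vanishes, which is the behavior that "insensitive to image sampling" refers to.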
