Occlusion Boundaries from Motion: Low-Level Detection and Mid-Level Reasoning

The boundaries of objects in an image are often considered a nuisance to be “handled” due to the occlusion they exhibit. Since most, if not all, computer vision techniques aggregate information spatially within a scene, information spanning these boundaries, and therefore from different physical surfaces, is invariably and erroneously considered together. In addition, these boundaries convey important perceptual information about 3D scene structure and shape. Consequently, their identification can benefit many different computer vision pursuits, from low-level processing techniques to high-level reasoning tasks.

While much focus in computer vision is placed on the processing of individual, static images, many applications actually offer video, or sequences of images, as input. The extra temporal dimension of the data allows the motion of the camera or the scene to be used in processing. In this paper, we focus on the exploitation of subtle relative-motion cues present at occlusion boundaries. When combined with more standard appearance information, we demonstrate these cues’ utility in detecting occlusion boundaries locally. We also present a novel, mid-level model for reasoning more globally about object boundaries and propagating such local information to extract improved, extended boundaries.
