Recovering Occlusion Boundaries from an Image

Occlusion reasoning is a fundamental problem in computer vision. In this paper, we propose an algorithm to recover the occlusion boundaries and depth ordering of free-standing structures in the scene. Rather than viewing the problem as one of pure image processing, our approach employs cues from an estimated surface layout and applies Gestalt grouping principles using a conditional random field (CRF) model. We propose a hierarchical segmentation process, based on agglomerative merging, that re-estimates boundary strength as the segmentation progresses. Our experiments on the Geometric Context dataset validate our choices for features, our iterative refinement of classifiers, and our CRF model. In experiments on the Berkeley Segmentation Dataset, PASCAL VOC 2008, and LabelMe, we also show that the trained algorithm generalizes to other datasets and can be used as an object boundary predictor with figure/ground labels.
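The agglomerative merging idea mentioned above can be illustrated with a minimal sketch. This is not the paper's implementation: the region representation, the `agglomerative_merge` function, and the `mean_difference` strength function are all hypothetical stand-ins (the paper uses learned classifiers and a CRF, which are omitted here). The sketch only shows the control flow of merging across the weakest boundary and then re-estimating boundary strengths on the enlarged regions.

```python
def agglomerative_merge(regions, adjacency, estimate_strength, threshold):
    """Greedy merging with boundary strengths re-estimated every round.

    regions: dict mapping region id -> frozenset of pixel values (toy representation)
    adjacency: set of frozenset({a, b}) pairs of neighbouring region ids
    estimate_strength: callable(region_a, region_b) -> float; a hypothetical
        stand-in for a learned boundary classifier
    threshold: stop once the weakest remaining boundary is at least this strong
    """
    regions = dict(regions)
    adjacency = set(adjacency)
    next_id = max(regions) + 1
    while adjacency:
        # Re-estimate every boundary on the *current* regions, so that
        # evidence from larger merged regions is taken into account.
        strengths = {}
        for pair in adjacency:
            a, b = sorted(pair)
            strengths[pair] = estimate_strength(regions[a], regions[b])
        weakest = min(strengths, key=strengths.get)
        if strengths[weakest] >= threshold:
            break
        a, b = sorted(weakest)
        regions[next_id] = regions.pop(a) | regions.pop(b)
        # Rewire neighbours of a or b to point at the merged region.
        rewired = set()
        for pair in adjacency:
            if pair == weakest:
                continue
            rest = pair - {a, b}
            rewired.add(pair if len(rest) == 2 else frozenset(rest | {next_id}))
        adjacency = rewired
        next_id += 1
    return regions


def mean_difference(region_a, region_b):
    # Toy strength: scaled difference of mean pixel value (assumption for the demo).
    mean = lambda r: sum(r) / len(r)
    return abs(mean(region_a) - mean(region_b)) / 10.0


toy_regions = {0: frozenset({0, 1}), 1: frozenset({2, 3}), 2: frozenset({10, 11})}
toy_adjacency = {frozenset({0, 1}), frozenset({1, 2})}
result = agglomerative_merge(toy_regions, toy_adjacency, mean_difference, threshold=0.5)
print(sorted(sorted(r) for r in result.values()))  # [[0, 1, 2, 3], [10, 11]]
```

In the toy run, the two similar regions merge first; the boundary to the third region is then re-scored against the merged region and is strong enough to stop the process.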
