Optimal Image and Video Closure by Superpixel Grouping

Detecting independent objects in images and videos is an important perceptual grouping problem. One common perceptual grouping cue that can facilitate this objective is contour closure, which reflects the spatial coherence of objects in the world and their projections as closed boundaries separating figure from background. Detecting contour closure in images consists of finding a cycle of disconnected contour fragments that separates an object from its background. Searching the entire space of possible groupings is intractable, so previous approaches have adopted powerful perceptual grouping heuristics, such as proximity and co-curvilinearity, to constrain the search. We introduce a new formulation that transforms the problem of finding cycles of contour fragments into that of finding subsets of superpixels whose collective boundary has strong edge support (few gaps) in the image. Our cost function, a ratio of a boundary gap measure to area, promotes spatially coherent sets of superpixels. Moreover, its properties support a global optimization procedure based on parametric maxflow. Extending closure detection to videos, we introduce the concept of spatiotemporal closure. Analogous to image closure, we formulate our spatiotemporal closure cost over a graph of spatiotemporal superpixels. This cost function is a ratio of motion and appearance discontinuity measures on the boundary of the selection to an internal homogeneity measure of the selected spatiotemporal volume. The resulting approach automatically recovers coherent components in images and videos, corresponding to objects, object parts, and objects with surrounding context, providing a good set of multiscale hypotheses for high-level scene analysis. We evaluate both our image and video closure frameworks by comparing them to other closure detection approaches, and find that they yield improved performance.
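To make the image closure cost concrete, the following is a minimal sketch of minimizing a ratio of boundary gap to area over subsets of superpixels. It is an illustration under stated assumptions, not the authors' implementation: the four-superpixel graph, the numbers, and the names `areas`, `pair_gap`, `border_gap`, and `min_ratio_closure` are all hypothetical. The ratio is minimized with a Dinkelbach-style iteration whose inner step is an exhaustive search over subsets, which is feasible only for toy inputs; the abstract states that the actual method relies on parametric maxflow instead.

```python
from itertools import combinations

# Hypothetical toy data: superpixels 0 and 1 form an "object",
# superpixels 2 and 3 form the background touching the image frame.
areas = {0: 100.0, 1: 120.0, 2: 150.0, 3: 130.0}
# Gap = boundary length WITHOUT edge support between two adjacent superpixels.
pair_gap = {(0, 1): 30.0, (0, 2): 2.0, (1, 2): 3.0, (1, 3): 1.0, (2, 3): 40.0}
# Gap along the image frame (assumed to have no edge support).
border_gap = {2: 50.0, 3: 45.0}


def boundary_gap(subset):
    """Total gap on the boundary of the union of the selected superpixels."""
    s = set(subset)
    gap = sum(border_gap.get(i, 0.0) for i in s)
    # A shared boundary lies on the outer boundary only if exactly one side is selected.
    gap += sum(w for (i, j), w in pair_gap.items() if (i in s) != (j in s))
    return gap


def area(subset):
    """Total area of the selected superpixels."""
    return sum(areas[i] for i in subset)


def min_ratio_closure(max_iter=50, tol=1e-12):
    """Dinkelbach-style minimization of gap/area (exhaustive inner step, toy only)."""
    nodes = sorted(areas)
    subsets = [c for r in range(1, len(nodes) + 1) for c in combinations(nodes, r)]
    best = subsets[0]
    lam = boundary_gap(best) / area(best)
    for _ in range(max_iter):
        # Inner problem: minimize gap(S) - lam * area(S) for the current ratio lam.
        cand = min(subsets, key=lambda s: boundary_gap(s) - lam * area(s))
        if boundary_gap(cand) - lam * area(cand) >= -tol:
            break  # no subset improves on the current ratio
        best = cand
        lam = boundary_gap(best) / area(best)
    return best, lam


best, ratio = min_ratio_closure()
print(best, ratio)
```

In this toy example the selected subset is the one whose outer boundary has the fewest gaps relative to its area. Dividing by area is what keeps the cost from favoring tiny selections, and the frame gap is what keeps the whole image from being a trivial zero-cost solution.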
