Image segmentation with patch-pair density priors

In this paper, we investigate how an unlabeled image corpus can facilitate the segmentation of any given image. A simple yet efficient multi-task joint sparse representation model is presented to augment the patch-pair similarities by harnessing the newly discovered patch-pair density priors. First, each image is over-segmented into a set of patches. The adjacent patch-pair density priors, statistically calculated from the unlabeled image corpus, yield an intuitive and informative observation: kindred patch-pairs generally have higher densities than inhomogeneous patch-pairs. Then, for each adjacent patch-pair within the given image, a high-density biased multi-task joint sparse reconstruction is pursued such that 1) both the individual patches and the patch-pair can be reconstructed with few patch-pairs from the unlabeled image corpus, and 2) the patch-pairs selected for reconstruction are high-density biased, namely, they are preferentially patch-pairs belonging to the same semantic region. In this way, the overall reconstruction residue conveys discriminative information on whether the two patches belong to the same semantic region, and consequently the patch affinity matrix is augmented by the reconstruction residues for all adjacent patch-pairs within the given image. The final image segmentation is derived by applying the popular normalized cut approach to the augmented patch affinity matrix. Extensive image segmentation experiments over two public databases demonstrate the superiority of the proposed solution over several state-of-the-art algorithms. Furthermore, the practicality of the algorithm is validated with comparison experiments on content-based image retrieval and multi-label image annotation performed over the image segmentation outputs.
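The idea of a density-biased sparse reconstruction can be illustrated with a minimal sketch. It is not the multi-task joint model described above: it is a simplified single-task weighted lasso solved by plain iterative soft-thresholding (ISTA). All names are hypothetical: `D` stands for a dictionary whose columns are corpus patch-pair features, `density` for their estimated densities, and `y` for the feature of the patch-pair under test. The inverse-density penalty is an assumption chosen only to show how high-density atoms can be preferred.

```python
import numpy as np

def weighted_lasso_ista(D, y, weights, lam=0.1, n_iter=500):
    """Minimise 0.5*||y - D x||^2 + lam * sum_i weights[i]*|x_i| with ISTA."""
    # Step size = 1 / Lipschitz constant of the gradient of the quadratic term.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)          # gradient of the data-fit term
        z = x - step * grad               # gradient step
        thresh = step * lam * weights     # per-atom soft threshold
        x = np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)
    return x

def pair_residual(D, y, density, lam=0.1):
    """Reconstruct y from dictionary atoms, penalising low-density atoms more.

    Assumption: penalty weight = 1 / density (illustrative, not from the paper).
    Returns the reconstruction residual norm and the sparse code.
    """
    weights = 1.0 / (density + 1e-8)
    x = weighted_lasso_ista(D, y, weights, lam)
    return float(np.linalg.norm(y - D @ x)), x

# Toy usage: 30 random unit-norm atoms in 20 dimensions.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 30))
D /= np.linalg.norm(D, axis=0)
density = np.full(30, 0.1)
density[0] = 1.0                          # atom 0 is a "high-density" pair
y = D[:, 0].copy()                        # the query matches atom 0
residual, code = pair_residual(D, y, density)
print(residual, int(np.argmax(np.abs(code))))
```

In this sketch a small residual indicates that the patch-pair is well explained by a few high-density corpus pairs. How the residual is mapped into the patch affinity matrix before normalized cut is left open here, since the abstract does not specify it.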
