Ground-truth dataset and baseline evaluations for image base-detail separation algorithms

Base-detail separation is a fundamental computer vision problem in which an image is modeled as a smooth base layer carrying the coarse structures and a detail layer containing the texture-like structures. One challenge in estimating the base is preserving sharp boundaries between objects or parts, which is needed to avoid halo artifacts. Many methods have been proposed to address this problem, but there is no ground-truth dataset of real images for quantitative evaluation. We propose a procedure to construct such a dataset and provide two datasets, Pascal Base-Detail and Fashionista Base-Detail, containing 1000 and 250 images, respectively. We assume that the base is piecewise smooth and label the appearance of each piece with a polynomial model. The pieces are objects and parts of objects, obtained from human annotations. Finally, we propose a way to evaluate methods against our base-detail ground truth and compare the performance of seven state-of-the-art algorithms.
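
The piecewise-polynomial assumption can be illustrated with a minimal sketch. The abstract does not specify the polynomial order, the fitting procedure, or the color handling, so everything below is an assumption made for illustration: a grayscale image, an integer label map standing in for the human annotations, an ordinary least-squares fit, and the hypothetical helper names `poly_features` and `fit_piecewise_base`.

```python
import numpy as np

def poly_features(xs, ys, order):
    """Return monomials x^i * y^j with i + j <= order, one per column."""
    cols = [(xs ** i) * (ys ** j)
            for i in range(order + 1)
            for j in range(order + 1 - i)]
    return np.stack(cols, axis=1)

def fit_piecewise_base(image, labels, order=2):
    """Fit one 2D polynomial per labeled region (assumed least-squares fit)."""
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w]
    # Normalize coordinates to [-1, 1] so the fit is better conditioned.
    xx = 2.0 * xx / max(w - 1, 1) - 1.0
    yy = 2.0 * yy / max(h - 1, 1) - 1.0
    base = np.zeros_like(image, dtype=float)
    for region in np.unique(labels):
        mask = labels == region
        A = poly_features(xx[mask], yy[mask], order)
        coeffs, *_ = np.linalg.lstsq(A, image[mask].astype(float), rcond=None)
        base[mask] = A @ coeffs
    return base

# Synthetic example: two regions with different linear shading plus noise.
rng = np.random.default_rng(0)
h, w = 32, 48
yy, xx = np.mgrid[0:h, 0:w]
labels = (xx >= w // 2).astype(int)          # stand-in for annotated pieces
true_base = np.where(labels == 0, 0.2 + 0.005 * xx, 0.8 - 0.004 * yy)
image = true_base + 0.01 * rng.standard_normal((h, w))

base = fit_piecewise_base(image, labels, order=1)
detail = image - base                        # residual treated as detail
```

Because each region is fitted independently, the reconstructed base stays smooth inside a piece while remaining free to jump at the annotated boundaries, which is the property the abstract ties to avoiding halo artifacts.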
