HEMS: Hierarchical Exemplar-Based Matching-Synthesis for Object-Aware Image Reconstruction

Motivated by the attention that viewers pay to salient objects, conventional region-of-interest (ROI)-based image coding approaches assign more bits to ROIs and fewer bits to other regions. This improves the perceptual quality of salient object regions, but at the cost of unpleasant artifacts in non-ROI regions. To address this issue, we concentrate on the efficient compression of object-centered images by encoding salient objects and background features separately. To recover both the object and the background, we propose a hierarchical exemplar-based matching-synthesis (HEMS) approach that reconstructs the image from exemplars. In the proposed framework, the salient object regions are encoded, and for the background only the quantized color features and local descriptors are kept, which reduces the bit rate. To make background reconstruction practical, the hierarchical framework is designed in three layers: relevant image search, patch candidate matching, and distortion-optimized image synthesis. First, image search over an external database returns relevant images, limiting the search space to a feasible number of patch candidates. Second, patches are matched by color features to select the appropriate candidates. Finally, distortion-optimized image synthesis automatically chooses the most suitable texture sample and seamlessly reconstructs the image. Compared to conventional ROI-based image coding schemes, the proposed approach can achieve better visual quality in both ROI and background regions.
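As a rough illustration of the second layer (matching patches by color features), the sketch below ranks candidate patches by the distance between their quantized mean color and a stored target feature. It is only a minimal sketch under stated assumptions: the abstract does not specify the color feature, the quantizer, or the distance, so the mean-RGB feature, the uniform `step` quantizer, the squared-distance ranking, and all function names here are hypothetical stand-ins rather than the method used in the paper.

```python
from typing import List, Tuple

Color = Tuple[float, float, float]
Patch = List[List[Color]]  # rows of (R, G, B) pixels


def mean_color(patch: Patch) -> Color:
    """Average colour over all pixels of a patch (assumed colour feature)."""
    pixels = [px for row in patch for px in row]
    n = len(pixels)
    return tuple(sum(px[c] for px in pixels) / n for c in range(3))


def quantize(color: Color, step: int = 16) -> Tuple[int, ...]:
    """Uniformly quantize each channel to a bin index (assumed quantizer)."""
    return tuple(int(v) // step for v in color)


def match_candidates(target_feature: Tuple[int, ...],
                     candidates: List[Patch],
                     k: int = 2,
                     step: int = 16) -> List[int]:
    """Return indices of the k candidates closest to the target feature.

    Distance is the squared difference between quantized mean colours;
    ties are broken by the lower candidate index.
    """
    def dist(i: int) -> int:
        feat = quantize(mean_color(candidates[i]), step)
        return sum((a - b) ** 2 for a, b in zip(feat, target_feature))

    return sorted(range(len(candidates)), key=lambda i: (dist(i), i))[:k]


def flat_patch(color: Color, size: int = 2) -> Patch:
    """Helper for the example: a size x size patch of one colour."""
    return [[color] * size for _ in range(size)]


# Feature kept for one background patch, and three retrieved candidates.
target = quantize((105, 148, 225))
candidates = [
    flat_patch((40, 160, 50)),    # grass-like
    flat_patch((100, 150, 230)),  # sky-like
    flat_patch((90, 120, 200)),   # darker sky
]
best = match_candidates(target, candidates, k=2)
```

In this toy setup the two sky-like patches rank ahead of the grass-like one. In a pipeline like the one described, the retained candidates would then be passed on to the synthesis layer, which selects among them.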
