Layer-based sparse representation of multiview images

Abstract: This article presents a novel method for obtaining a sparse representation of multiview images. The method exploits the fact that multiview data is composed of epipolar-plane image lines, which are highly redundant. We extend this principle to obtain a layer-based representation, which partitions a multiview image dataset into redundant regions (which we call layers), each related to a constant depth in the observed scene. The layers are extracted using a general segmentation framework that takes into account the camera setup and occlusion constraints. To obtain a sparse representation, the extracted layers are further decomposed using a multidimensional discrete wavelet transform (DWT): a DWT is applied first across the view domain, followed by a two-dimensional (2D) DWT applied to the image dimensions. We modify the viewpoint DWT to take into account occlusions and scene depth variations. Simulation results based on nonlinear approximation show that our representation is sparser than the multidimensional DWT without disparity compensation. In addition, we demonstrate that the constant-depth model of the representation can be used to synthesise novel viewpoints for immersive viewing applications and to denoise multiview images.
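
The benefit of disparity compensation for a constant-depth layer can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: it assumes a single layer with an integer per-view disparity, models the shift as circular (so occlusions and image boundaries are ignored), and uses a one-level Haar filter as a stand-in for the viewpoint DWT. The names `haar_across_views`, `align_layer`, and `count_significant` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def haar_across_views(stack):
    """One-level orthonormal Haar transform along axis 0 (the view axis)."""
    even, odd = stack[0::2], stack[1::2]
    low = (even + odd) / np.sqrt(2.0)
    high = (even - odd) / np.sqrt(2.0)
    return low, high

def align_layer(epi, disparity):
    """Shift view k back by k*disparity pixels so the layer lines up across views.

    Assumption: integer disparity and circular shifts (no occlusion handling).
    """
    return np.stack([np.roll(row, -k * disparity) for k, row in enumerate(epi)])

def count_significant(coeffs, tol=1e-9):
    """Number of coefficients whose magnitude exceeds tol."""
    return int(np.sum(np.abs(coeffs) > tol))

# Toy epipolar-plane image: one texture line seen from 8 views, with the
# layer shifting by a constant 3 pixels between neighbouring views.
rng = np.random.default_rng(0)
texture = rng.standard_normal(64)
disparity = 3
num_views = 8
epi = np.stack([np.roll(texture, k * disparity) for k in range(num_views)])

# High-pass (detail) coefficients across views, without and with compensation.
_, high_plain = haar_across_views(epi)
_, high_comp = haar_across_views(align_layer(epi, disparity))

plain_count = count_significant(high_plain)
comp_count = count_significant(high_comp)
print("significant detail coefficients without compensation:", plain_count)
print("significant detail coefficients with compensation:", comp_count)
```

In this idealised setting the aligned views are identical, so the inter-view detail coefficients vanish and all of the energy moves into the low-pass band, which a subsequent 2D DWT can then compact further. Real layers involve occlusions and depth variation within a layer, which is why the article modifies the viewpoint DWT rather than applying a plain shift.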
