An object-based compression system for a class of dynamic image-based representations

This paper proposes a new object-based compression system for a class of dynamic image-based representations called plenoptic videos (PVs). PVs are simplified dynamic light fields in which the videos are taken at regularly spaced locations along line segments instead of on a 2-D plane. The proposed system employs an object-based approach, where objects at different depth values are segmented to improve the rendering quality, as in pop-up light fields. Furthermore, by coding the plenoptic video at the object level, desirable functionalities such as content scalability, error resilience, and interactivity with individual image-based rendering (IBR) objects can be achieved. Besides coding the texture and binary shape maps of IBR objects with arbitrary shapes, the proposed system also supports the coding of gray-scale alpha maps and of geometry information in the form of depth maps, which facilitate the matting and rendering of the IBR objects, respectively. To improve the coding performance, the proposed compression system exploits both the temporal and the spatial redundancy among the video object streams in the PV by employing disparity-compensated prediction or spatial prediction in its texture, shape, and depth coding processes. To demonstrate the principle and effectiveness of the proposed system, a multi-camera video system was built, and experimental results show that considerable improvements in coding performance are obtained for both synthetic and real scenes while supporting the stated object-based functionalities.
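As a rough illustration of the disparity-compensated prediction mentioned above, the following Python sketch predicts each block of one view from a horizontally shifted block of a neighboring view and keeps only the prediction error. It is a minimal sketch under stated assumptions, not the coder described in the paper: the function names, the block size `bs`, the search range `max_d`, the sum-of-absolute-differences criterion, and the restriction to full-image rectangular blocks (no shape, alpha, or depth maps) are all hypothetical choices made for illustration. Images are plain lists of rows whose dimensions are assumed to be multiples of `bs`.

```python
# Illustrative sketch only: names, parameters, and the SAD criterion are
# assumptions for this example and are not taken from the paper.

def sad(cur, ref, y0, x0, d, bs):
    """Sum of absolute differences between the bs x bs block of `cur` at
    (y0, x0) and the block of `ref` shifted horizontally by disparity d."""
    total = 0
    for y in range(y0, y0 + bs):
        for x in range(x0, x0 + bs):
            total += abs(cur[y][x] - ref[y][x + d])
    return total


def disparity_compensated_prediction(cur, ref, bs=4, max_d=4):
    """Predict `cur` from `ref` block by block using a horizontal search.

    Returns (disparities, residual): a dict mapping each block origin
    (y0, x0) to its chosen disparity, and the per-pixel prediction error
    that a codec would go on to transform and entropy code.
    Assumes image height and width are multiples of bs.
    """
    h, w = len(cur), len(cur[0])
    residual = [[0] * w for _ in range(h)]
    disparities = {}
    for y0 in range(0, h - bs + 1, bs):
        for x0 in range(0, w - bs + 1, bs):
            # Keep only shifts whose reference block lies inside the image.
            candidates = [d for d in range(-max_d, max_d + 1)
                          if 0 <= x0 + d and x0 + d + bs <= w]
            # Lowest SAD wins; ties go to the smaller shift.
            best = min(candidates,
                       key=lambda d: (sad(cur, ref, y0, x0, d, bs), abs(d)))
            disparities[(y0, x0)] = best
            for y in range(y0, y0 + bs):
                for x in range(x0, x0 + bs):
                    residual[y][x] = cur[y][x] - ref[y][x + best]
    return disparities, residual
```

In this sketch, `ref` would be the frame captured at the same instant by an adjacent camera. Passing the previous frame of the same camera instead turns the same search into a simple one-dimensional form of temporal prediction, which is one way to picture how the two kinds of redundancy named in the abstract could be exploited by a common prediction mechanism.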
