Design and Optimization of Graph Transform for Image and Video Compression

The main contribution of this thesis is the introduction of new methods for designing adaptive transforms for image and video compression. Exploiting graph signal processing techniques, we develop new graph construction methods targeted at image and video compression applications. In this way, we obtain a graph that is both a good representation of the image and easy to transmit to the decoder. To do so, we investigate different research directions. First, we propose a new method for graph construction that employs innovative edge metrics, quantization, and edge prediction techniques. Then, we propose a graph learning approach and introduce a new graph learning algorithm targeted at image compression, which defines the connectivity between pixels by accounting for the cost of coding both the image signal and the graph topology in rate-distortion terms. Moreover, we present a new superpixel-driven graph transform that uses clusters of superpixels as coding blocks and then computes the graph transform inside each region. In the second part of this work, we exploit graphs to design directional transforms. An efficient representation of the directional information in an image is extremely important for obtaining high-performance image and video coding. In this thesis, we present a new directional transform, called the Steerable Discrete Cosine Transform (SDCT). This new transform can be obtained by steering the 2D-DCT basis in any chosen direction. Moreover, we can also use steering patterns more complex than a single pure rotation. To show the advantages of the SDCT, we present a few image and video compression methods based on this new directional transform. The results show that the SDCT can be efficiently applied to image and video compression and that it outperforms the classical DCT and other directional transforms. Along the same lines, we also present a new generalization of the DFT, called the Steerable DFT (SDFT). Unlike the SDCT, the SDFT can be defined in one or two dimensions. The 1D-SDFT represents a rotation in the complex plane, whereas the 2D-SDFT performs a rotation in the 2D Euclidean space.
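
As an illustration of what steering the 2D-DCT basis could look like in practice, the following is a minimal sketch and not the construction given in the thesis. It assumes that steering means rotating each pair of separable 2D-DCT basis functions with swapped frequency indices, (k, l) and (l, k), by a common angle theta, and that basis functions with k = l are left unchanged. The function names `dct_matrix` and `steered_dct_basis` are hypothetical.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th 1D basis vector."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c


def steered_dct_basis(n, theta):
    """Return an (n*n, n*n) matrix whose rows are steered 2D-DCT basis vectors.

    Assumption (not taken from the source): each pair of separable basis
    functions (k, l) and (l, k) with k < l is rotated by the same angle
    theta, and basis functions with k == l are kept as they are.
    """
    c = dct_matrix(n)
    # Separable 2D-DCT basis: row k*n + l is the outer product of rows k and l.
    basis = np.zeros((n * n, n * n))
    for k in range(n):
        for l in range(n):
            basis[k * n + l] = np.outer(c[k], c[l]).ravel()

    steered = basis.copy()
    cs, sn = np.cos(theta), np.sin(theta)
    for k in range(n):
        for l in range(k + 1, n):
            a, b = basis[k * n + l], basis[l * n + k]
            # A plane rotation of an orthonormal pair stays orthonormal.
            steered[k * n + l] = cs * a + sn * b
            steered[l * n + k] = -sn * a + cs * b
    return steered


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    block = rng.standard_normal((8, 8))
    basis = steered_dct_basis(8, np.pi / 6)
    coeffs = basis @ block.ravel()                   # forward transform
    recon = (basis.T @ coeffs).reshape(8, 8)         # inverse transform
    print("max reconstruction error:", np.abs(recon - block).max())
```

With theta = 0 this sketch reduces to the ordinary separable 2D-DCT. Allowing a different angle for each (k, l) pair would be one way to express steering patterns more complex than a single rotation, again as an assumption about how such patterns might be parameterized.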
