End-to-End Facial Image Compression with Integrated Semantic Distortion Metric

Highly efficient facial image compression is widely required yet challenging in surveillance and security scenarios, while both traditional general-purpose image codecs and specialized facial image compression schemes only refine the codec heuristically and separately, guided by a face verification accuracy metric. We propose an End-to-End Facial Image Compression (E2EFIC) framework with a novel variable block size Regionally Adaptive Pooling (RAP) module, whose parameters are optimized automatically from the gradient feedback of an integrated semantic distortion metric. This includes a successful exploration of applying a Generative Adversarial Network (GAN) directly as a metric in the image compression scheme. The experimental results verify the framework’s efficiency, demonstrating bitrate savings of 71.41%, 48.28% and 52.67% over JPEG2000, WebP and neural network-based codecs, respectively, under the same face verification accuracy distortion metric. We also evaluate E2EFIC against the latest dedicated facial image codecs and observe a superior performance gain.
