Beyond Coding: Detection-driven Image Compression with Semantically Structured Bit-stream

With the development of 5G and edge computing, it is increasingly important to offload intelligent media computing to edge devices. Traditional media coding schemes encode media into a single binary stream without semantic structure, which prevents many important intelligent applications, including semantic analysis, parsing of specific content, and media editing, from operating directly at the bit-stream level. In this paper, we therefore propose a learning-based Semantically Structured Coding (SSC) framework that generates a Semantically Structured Bit-stream (SSB), in which each part of the bit-stream represents a certain object and can be used directly for the aforementioned tasks. Specifically, we integrate an object detection module into our compression framework to locate and align objects in the feature domain. After quantization and entropy coding, the features are reorganized according to the detected and aligned objects to form the bit-stream. In addition, unlike existing learning-based compression schemes that train a separate model for each bit-rate, we share most model parameters across bit-rates, which significantly reduces model size for variable-rate compression. Experimental results demonstrate that, at the cost of only negligible overhead, objects can be completely reconstructed from a partial bit-stream. We also verify that classification and pose estimation can be performed directly on a partial bit-stream without performance degradation.
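To make the idea of a bit-stream organized by object more concrete, the following is a minimal sketch of one possible container layout, not the format used in the paper. The names `pack_ssb`, `read_index`, and `extract_object`, the header layout (label id, bounding box, offset, length), and the use of `zlib` as a stand-in for the learned entropy coder are all assumptions made for illustration only.

```python
import struct
import zlib

# Hypothetical layout: [object count][index entries][concatenated payloads]
HEADER = struct.Struct("<H")       # number of object segments
ENTRY = struct.Struct("<H4HII")    # label id, bbox (x, y, w, h), payload offset, payload length


def pack_ssb(objects):
    """Pack (label_id, (x, y, w, h), payload_bytes) tuples into one stream."""
    index = bytearray()
    body = bytearray()
    for label_id, bbox, payload in objects:
        # Offsets are relative to the start of the payload area.
        index += ENTRY.pack(label_id, *bbox, len(body), len(payload))
        body += payload
    return HEADER.pack(len(objects)) + bytes(index) + bytes(body)


def read_index(stream):
    """Read only the index, without touching any payload bytes."""
    (count,) = HEADER.unpack_from(stream, 0)
    entries = []
    for i in range(count):
        label_id, x, y, w, h, offset, length = ENTRY.unpack_from(
            stream, HEADER.size + i * ENTRY.size
        )
        entries.append((label_id, (x, y, w, h), offset, length))
    return entries


def extract_object(stream, i):
    """Return the coded segment of object i, leaving the other segments unread."""
    entries = read_index(stream)
    _, _, offset, length = entries[i]
    start = HEADER.size + len(entries) * ENTRY.size + offset
    return stream[start:start + length]


# Stand-in "quantized features"; zlib replaces the learned entropy coder here.
person_features = bytes([1, 2, 3, 4] * 8)
dog_features = bytes([9, 8, 7] * 8)
stream = pack_ssb([
    (0, (10, 20, 64, 128), zlib.compress(person_features)),
    (1, (100, 40, 80, 60), zlib.compress(dog_features)),
])

# Decode only the second object from its part of the stream.
dog_segment = extract_object(stream, 1)
decoded_dog = zlib.decompress(dog_segment)
```

In this sketch, a task such as classification of a single object would only need `read_index` plus one call to `extract_object`, which mirrors the abstract's claim that analysis can run on a partial bit-stream; the actual SSB header and segment design may differ.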
