Paper Information - Video Coding for Machines: A Paradigm of Collaborative Compression and Intelligent Analytics

Video coding, which aims to compress and reconstruct the whole frame, and feature compression, which preserves and transmits only the most critical information, stand at two ends of the scale. The former offers full fidelity for human perception; the latter offers compactness and efficiency for machine vision. Recent advances in video compression, e.g. deep learning based coding tools and end-to-end image/video coding, and in the MPEG-7 compact feature descriptor standards, i.e. Compact Descriptors for Visual Search and Compact Descriptors for Video Analysis, have driven sustained and rapid development in each direction separately. In this article, building on recent progress in AI technology, e.g. prediction and generation models, we explore a new area, Video Coding for Machines (VCM), arising from the emerging MPEG standardization efforts (see https://lists.aau.at/mailman/listinfo/mpeg-vcm). Aiming at collaborative compression and intelligent analytics, VCM attempts to bridge the gap between feature coding for machine vision and video coding for human vision. In line with Digital Retina, a rising instance of the Analyze-then-Compress approach, we first give the definition, formulation, and paradigm of VCM. We then systematically review state-of-the-art techniques in video compression and feature compression from the perspective of MPEG standardization, which provides academic and industrial evidence for realizing the collaborative compression of video and feature streams across a broad range of AI applications. Finally, we propose potential VCM solutions; preliminary results demonstrate performance and efficiency gains. Future directions are also discussed.
