Human-Perception-Oriented Pseudo Analog Video Transmissions With Deep Learning

Recently, pseudo analog transmission has gained increasing attention due to its ability to alleviate the cliff effect in video multicast scenarios. Existing pseudo analog systems are optimized under the minimum mean squared error criterion, so their power allocation strategies do not take perceptual video quality into consideration. In this article, we propose a human-perception-based pseudo analog video transmission system named ROIC-Cast, which aims to intelligently enhance the transmission quality of the region-of-interest (ROI) parts. Firstly, a classic deep-learning-based saliency detection algorithm is adopted to decompose the continuous video sequences into ROI and non-ROI blocks. Secondly, an effective compression method is used to reduce the amount of side information generated by the ROI extraction module. Then, the power allocation scheme is formulated as a convex problem, and the optimal transmission power for both ROI and non-ROI blocks is derived in closed form. Finally, simulations are conducted to validate the proposed system by comparing it with several existing systems, e.g., KMV-Cast, SoftCast, and DAC-RAN. The proposed ROIC-Cast achieves ROI peak signal-to-noise ratio gains of over 4.1 dB compared with the other systems at channel signal-to-noise ratios of −5 dB, 0 dB, 5 dB, and 10 dB. This significant performance improvement is due to the automatic ROI extraction, the high-efficiency data compression, and the adaptive power allocation.
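
To make the closed-form power allocation step concrete, the following is a minimal sketch and not the derivation used in ROIC-Cast, which the abstract does not state. It assumes a SoftCast-style model in which block i with variance λ_i is scaled by a gain g_i, the total power constraint is Σ g_i² λ_i = P, and the high-SNR distortion of block i is approximately σ²/g_i². The perceptual weights w_i (larger for ROI blocks) and the function name `weighted_power_allocation` are hypothetical. Under these assumptions, minimizing Σ w_i σ²/g_i² with a Lagrange multiplier gives g_i² = P·sqrt(w_i/λ_i) / Σ_j sqrt(w_j λ_j), which reduces to the familiar λ_i^(−1/4) scaling when all weights are equal.

```python
import math


def weighted_power_allocation(variances, weights, total_power):
    """Return per-block gains g_i for a weighted, SoftCast-style allocation.

    Assumed model (illustrative, not taken from the paper):
      minimize   sum_i w_i * sigma^2 / g_i^2      (high-SNR distortion)
      subject to sum_i g_i^2 * var_i = total_power
    Closed form: g_i^2 = P * sqrt(w_i / var_i) / sum_j sqrt(w_j * var_j).
    """
    if len(variances) != len(weights):
        raise ValueError("variances and weights must have the same length")
    if any(v <= 0 for v in variances) or any(w <= 0 for w in weights):
        raise ValueError("variances and weights must be positive")

    # Normalizer that makes the allocation meet the power budget exactly.
    norm = sum(math.sqrt(w * v) for w, v in zip(weights, variances))
    return [
        math.sqrt(total_power * math.sqrt(w / v) / norm)
        for w, v in zip(weights, variances)
    ]


def transmitted_power(variances, gains):
    """Total transmit power sum_i g_i^2 * var_i for the given gains."""
    return sum(g * g * v for g, v in zip(gains, variances))


if __name__ == "__main__":
    block_variances = [4.0, 1.0, 0.25]   # hypothetical block variances
    roi_weights = [3.0, 1.0, 1.0]        # hypothetical: first block is ROI
    gains = weighted_power_allocation(block_variances, roi_weights, 10.0)
    print(gains, transmitted_power(block_variances, gains))
```

Raising the weight of a block increases its gain at the expense of the others while the total power stays fixed, which is the qualitative behavior an ROI-aware allocation is meant to have.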
