Adaptive Multi-Modality Residual Network for Compression Distorted Multi-View Depth Video Enhancement

Compression-distorted multi-view video plus depth (MVD) should be enhanced at the receiver side without access to the original signals. This is especially true for the depth maps, because they describe positioning information in 3D space and are important for subsequent virtual view synthesis. However, challenges arise from how to exploit multi-modality priors from neighboring viewpoints and how to handle vanishing gradients when textureless depth maps are involved. In this paper, we propose a multi-modality residual network to enhance the quality of compressed multi-view depth video. Taking advantage of the high correlation among different viewpoints, depth maps from adjacent views are exploited as guidance for enhancing the depth video in the target view. Color frames in the target view are also used to provide information about object contours, yielding multi-modality guidance. The proposed network is organized as a deep residual network to eliminate distortion and restore details. Because these multi-modality guidance signals correlate differently with the target depth video, and not all of their information contributes to the enhancement, an adaptive skip structure is designed to exploit the contribution of each prior appropriately. Experimental results show that our scheme outperforms the other benchmarks and achieves average gains of 1.935 dB in PSNR and 0.0227 in SSIM over all test sequences. Objective, subjective, and 3D reconstruction results all suggest that our method can provide superior performance in practical applications.
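
The PSNR gain quoted in the abstract is the difference between two PSNR values: that of the enhanced depth frame and that of the compressed depth frame, each measured against the original. The following is a minimal sketch of that computation, not the authors' evaluation code. It assumes 8-bit depth maps (peak value 255) stored as nested lists; the function name `psnr` and the toy frames are hypothetical and chosen only for illustration.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Return the PSNR in dB between two equally sized 2D frames.

    Frames are lists of rows of numeric samples. `peak` is the maximum
    possible sample value (255 is an assumption for 8-bit depth maps).
    """
    if len(reference) != len(distorted):
        raise ValueError("frames must have the same number of rows")

    squared_error = 0.0
    count = 0
    for ref_row, dist_row in zip(reference, distorted):
        if len(ref_row) != len(dist_row):
            raise ValueError("rows must have the same length")
        for r, d in zip(ref_row, dist_row):
            squared_error += (r - d) ** 2
            count += 1

    if count == 0:
        raise ValueError("frames must not be empty")

    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy 2x3 depth frames (illustrative values, not data from the paper).
original = [[100, 100, 200], [100, 200, 200]]
compressed = [[104, 97, 195], [103, 204, 196]]   # larger errors
enhanced = [[101, 99, 199], [101, 201, 199]]     # smaller errors

gain_db = psnr(original, enhanced) - psnr(original, compressed)
print(f"PSNR gain: {gain_db:.3f} dB")
```

An average gain over a test set would, under the same assumptions, be the mean of such per-frame (or per-sequence) differences; how the paper aggregates them is not stated in the abstract.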
