Cross-modality Discrepant Interaction Network for RGB-D Salient Object Detection

The growing popularity of depth maps has brought new vitality to salient object detection (SOD), and many RGB-D SOD algorithms have been proposed, mainly concentrating on how to better integrate cross-modality features from the RGB image and the depth map. For cross-modality interaction in the feature encoder, existing methods either treat the RGB and depth modalities indiscriminately, or habitually use depth cues only as auxiliary information for the RGB branch. In contrast, we reconsider the status of the two modalities and propose a novel Cross-modality Discrepant Interaction Network (CDINet) for RGB-D SOD, which differentially models the dependence of the two modalities according to the feature representations of different layers. To this end, two components are designed to implement effective cross-modality interaction: 1) the RGB-induced Detail Enhancement (RDE) module leverages the RGB modality to enhance the details of the depth features in the low-level encoder stage; 2) the Depth-induced Semantic Enhancement (DSE) module transfers the object positioning and internal consistency of the depth features to the RGB branch in the high-level encoder stage. Furthermore, we design a Dense Decoding Reconstruction (DDR) structure, which constructs a semantic block by combining multi-level encoder features to upgrade the skip connections in feature decoding. Extensive experiments on five benchmark datasets demonstrate that our network outperforms 15 state-of-the-art methods both quantitatively and qualitatively. Our code is publicly available at: https://rmcong.github.io/proj_CDINet.html.
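
The direction of information flow described above (RGB guiding depth at low-level stages, depth guiding RGB at high-level stages) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the names `rde_like`, `dse_like`, `encode`, and `num_low` are hypothetical, the attention form (channel-wise max followed by a sigmoid, with residual gating) is a placeholder chosen for brevity, and the per-stage features are assumed to be precomputed rather than propagated through a backbone. The DDR decoder is not covered.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def rde_like(rgb_feat, depth_feat):
    """Hypothetical stand-in for RDE: RGB features guide the depth features."""
    # Spatial attention from RGB: channel-wise max, squashed to (0, 1).
    attn = sigmoid(rgb_feat.max(axis=0, keepdims=True))  # shape (1, H, W)
    # Residual gating keeps the original depth response and adds the gated one.
    return depth_feat + attn * depth_feat


def dse_like(rgb_feat, depth_feat):
    """Hypothetical stand-in for DSE: depth features guide the RGB features."""
    attn = sigmoid(depth_feat.max(axis=0, keepdims=True))  # shape (1, H, W)
    return rgb_feat + attn * rgb_feat


def encode(rgb_feats, depth_feats, num_low=2):
    """Apply discrepant interaction to per-stage features of shape (C, H, W).

    The first `num_low` stages are treated as low-level (an assumption of
    this sketch); the remaining stages are treated as high-level.
    """
    rgb_out, depth_out = [], []
    for stage, (r, d) in enumerate(zip(rgb_feats, depth_feats)):
        if stage < num_low:
            d = rde_like(r, d)   # low level: only the depth branch is updated
        else:
            r = dse_like(r, d)   # high level: only the RGB branch is updated
        rgb_out.append(r)
        depth_out.append(d)
    return rgb_out, depth_out
```

The point of the sketch is the asymmetry: at any given stage only one branch is modified, and which branch that is depends on the depth of the stage, in contrast to schemes that fuse both modalities the same way at every layer.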
