Multi-level Cross-modal Interaction Network for RGB-D Salient Object Detection

Depth cues carry rich spatial information and have been shown to benefit salient object detection (SOD), but the quality of the depth data directly affects SOD performance. Low-quality depth cues are unavoidable because of the limitations of acquisition devices, and they can degrade SOD performance. In addition, existing methods tend to combine RGB images and depth cues either by direct fusion or through a simple fusion module, which prevents them from effectively exploiting the complex correlations between the two sources. Moreover, few methods design a dedicated module to fully fuse multi-level features, which leaves cross-level feature interaction insufficient. To address these issues, we propose a novel Multi-level Cross-modal Interaction Network (MCI-Net) for RGB-D based SOD. MCI-Net includes two key components: 1) a cross-modal feature learning network, which learns high-level features for the RGB images and depth cues so that the correlations between the two sources can be exploited; and 2) a multi-level interactive integration network, which integrates multi-level cross-modal features to boost SOD performance. Extensive experiments on six benchmark datasets demonstrate the superiority of MCI-Net over 14 state-of-the-art methods and validate the effectiveness of its individual components. More importantly, MCI-Net significantly improves SOD performance while also running at a higher FPS.
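The two-component design described above can be illustrated with a toy sketch. This is not the MCI-Net implementation: the abstract does not specify the fusion or integration operators, so the gated element-wise fusion, the nearest-neighbour upsampling, and all function names below (`cross_modal_fuse`, `upsample2`, `integrate_levels`) are assumptions chosen only to show the general data flow, with 1-D lists standing in for feature maps.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def cross_modal_fuse(rgb_feat, depth_feat):
    """Hypothetical cross-modal fusion (an assumption, not the paper's module).

    The depth feature is added to the RGB feature through a gate computed
    from the product of the two, so a zero depth response adds nothing.
    """
    return [r + sigmoid(r * d) * d for r, d in zip(rgb_feat, depth_feat)]


def upsample2(feat):
    """Nearest-neighbour upsampling by a factor of 2 along a 1-D feature."""
    return [v for v in feat for _ in range(2)]


def integrate_levels(fused_levels):
    """Hypothetical top-down multi-level integration.

    `fused_levels` is ordered from shallow (longest) to deep (shortest);
    each level is half the length of the previous one. The deepest feature
    is upsampled and summed into each shallower level in turn.
    """
    out = fused_levels[-1]
    for feat in reversed(fused_levels[:-1]):
        up = upsample2(out)
        out = [a + b for a, b in zip(feat, up)]
    return out


# Toy three-level features for each modality (values are illustrative only).
rgb_levels = [[0.5] * 8, [1.0] * 4, [2.0] * 2]
depth_levels = [[0.0] * 8, [0.0] * 4, [0.0] * 2]

# Step 1: fuse RGB and depth at every level.
fused = [cross_modal_fuse(r, d) for r, d in zip(rgb_levels, depth_levels)]

# Step 2: integrate the fused features across levels.
saliency = integrate_levels(fused)
```

In this toy setup the depth features are all zero, so the fused features equal the RGB features and the output is simply the top-down sum of the three levels at the resolution of the shallowest one.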
