DPANet: Depth Potentiality-Aware Gated Attention Network for RGB-D Salient Object Detection.

RGB-D salient object detection faces two main issues: (1) how to effectively integrate the complementary information in cross-modal RGB-D data; and (2) how to prevent contamination from an unreliable depth map. These two problems are intertwined, but previous methods tend to focus only on the first and ignore depth map quality, which may cause the model to fall into a sub-optimal state. In this paper, we address both issues jointly in a single model and propose a novel network, DPANet, that explicitly models the potentiality of the depth map and effectively integrates the cross-modal complementarity. By introducing depth potentiality perception, the network learns to perceive the potentiality of the depth information and uses it to guide the fusion of the two modalities, preventing contamination. The gated multi-modality attention module in the fusion process uses an attention mechanism with a gate controller to capture long-range dependencies from a cross-modal perspective. Experimental comparisons with 16 state-of-the-art methods on 8 datasets demonstrate the validity of the proposed approach both quantitatively and qualitatively.
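
The gating idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the depth potentiality is a single scalar gate in [0, 1], that features are small lists of vectors, and that fusion is a residual sum of RGB features and depth features gathered by plain scaled dot-product attention. The names cross_attention and gated_fusion are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query position attends to all
    key positions, so each output can draw on long-range context."""
    scale = math.sqrt(len(queries[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(dim)])
    return out


def gated_fusion(rgb_feats, depth_feats, gate):
    """Add depth-attended features to the RGB features, scaled by `gate`.

    Assumption: `gate` stands in for a learned depth potentiality score.
    A gate near 0 suppresses an unreliable depth map; a gate near 1
    lets the depth cues through in full.
    """
    if not 0.0 <= gate <= 1.0:
        raise ValueError("gate must lie in [0, 1]")
    attended = cross_attention(rgb_feats, depth_feats, depth_feats)
    return [[r + gate * a for r, a in zip(r_row, a_row)]
            for r_row, a_row in zip(rgb_feats, attended)]
```

With gate = 0.0 the output equals the RGB features, so a depth map judged unreliable cannot contaminate them. With gate = 1.0 the attended depth features are added in full.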


[33]  Jianmin Jiang,et al.  A Simple Pooling-Based Design for Real-Time Salient Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Wei Ji,et al.  Depth-Induced Multi-Scale Recurrent Attention Network for Saliency Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[35]  Youfu Li,et al.  Three-Stream Attention-Aware Network for RGB-D Salient Object Detection , 2019, IEEE Transactions on Image Processing.

[36]  Junwei Han,et al.  CNNs-Based RGB-D Saliency Detection via Cross-View Transfer and Multiview Fusion. , 2018, IEEE transactions on cybernetics.

[37]  Hao Chen,et al.  CNNs-Based RGB-D Saliency Detection via Cross-View Transfer and Multiview Fusion , 2017 .

[38]  Henrik I. Christensen,et al.  RGB-D object tracking: A particle filter approach on GPU , 2013, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[39]  Qingming Huang,et al.  Co-Saliency Detection for RGBD Images Based on Multi-Constraint Feature Matching and Cross Label Propagation , 2017, IEEE Transactions on Image Processing.

[40]  Yang Cao,et al.  Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Guoqiang Han,et al.  R³Net: Recurrent Residual Refinement Network for Saliency Detection , 2018, IJCAI.

[42]  Huchuan Lu,et al.  Attentive Feedback Network for Boundary-Aware Salient Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Qingming Huang,et al.  Global Context-Aware Progressive Aggregation Network for Salient Object Detection , 2020, AAAI.

[44]  Huchuan Lu,et al.  Saliency Detection via Dense and Sparse Reconstruction , 2013, 2013 IEEE International Conference on Computer Vision.

[45]  Xiaochun Cao,et al.  Depth Enhanced Saliency Detection Method , 2014, ICIMCS '14.

[46]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[47]  Nick Barnes,et al.  Learning RGB-D Salient Object Detection Using Background Enclosure, Depth Contrast, and Top-Down Features , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[48]  Youfu Li,et al.  Progressively Complementarity-Aware Fusion Network for RGB-D Salient Object Detection , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[49]  Haibin Ling,et al.  Salient Object Detection in the Deep Learning Era: An In-Depth Survey , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[50]  Ting Zhao,et al.  Pyramid Feature Attention Network for Saliency Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Qingming Huang,et al.  Going From RGB to RGBD Saliency: A Depth-Guided Transformation Model , 2020, IEEE Transactions on Cybernetics.

[52]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[53]  Junhui Hou,et al.  CoADNet: Collaborative Aggregation-and-Distribution Networks for Co-Salient Object Detection , 2020, NeurIPS.

[54]  Qingming Huang,et al.  HSCS: Hierarchical Sparsity Based Co-saliency Detection for RGBD Images , 2018, IEEE Transactions on Multimedia.

[55]  Seunghoon Hong,et al.  Online Tracking by Learning Discriminative Saliency Map with Convolutional Neural Network , 2015, ICML.

[56]  Dan Su,et al.  Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for RGB-D salient object detection , 2019, Pattern Recognit..

[57]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[58]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[59]  Rongrong Ji,et al.  RGBD Salient Object Detection: A Benchmark and Algorithms , 2014, ECCV.

[60]  Xueqing Li,et al.  Leveraging stereopsis for saliency analysis , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[61]  S. Süsstrunk,et al.  Frequency-tuned salient region detection , 2009, CVPR 2009.

[62]  N. Otsu A threshold selection method from gray level histograms , 1979 .

[63]  Runmin Cong,et al.  RGB-D Salient Object Detection with Cross-Modality Modulation and Selection , 2020, ECCV.

[64]  Sabine Süsstrunk,et al.  Frequency-tuned salient region detection , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[65]  Ming-Hsuan Yang,et al.  PiCANet: Learning Pixel-Wise Contextual Attention for Saliency Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[66]  Tao Li,et al.  Structure-Measure: A New Way to Evaluate Foreground Maps , 2017, International Journal of Computer Vision.

[67]  Qingming Huang,et al.  Saliency Detection for Stereoscopic Images Based on Depth Confidence Analysis and Multiple Cues Fusion , 2016, IEEE Signal Processing Letters.