Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion

RGB-D salient object detection (SOD) is usually formulated as a problem of classification or regression over two modalities, i.e., RGB and depth. Hence, effective RGB-D feature modeling and multi-modal feature fusion both play a vital role in RGB-D SOD. In this paper, we propose a depth-sensitive RGB feature modeling scheme using the depth-wise geometric prior of salient objects. In principle, the feature modeling scheme is carried out in a depth-sensitive attention module, which leads to the RGB feature enhancement as well as the background distraction reduction by capturing the depth geometry prior. More-over, to perform effective multi-modal feature fusion, we further present an automatic architecture search approach for RGB-D SOD, which does well in finding out a feasible architecture from our specially designed multi-modal multi-scale search space. Extensive experiments on seven standard benchmarks demonstrate the effectiveness of the proposed approach against the state-of-the-art.

[1]  Zhi Liu,et al.  Salient region detection for stereoscopic images , 2014, 2014 19th International Conference on Digital Signal Processing.

[2]  Huan Du,et al.  Depth-Aware Salient Object Detection and Segmentation via Multiscale Discriminative Saliency Fusion and Bootstrap Learning , 2017, IEEE Transactions on Image Processing.

[3]  Quoc V. Le,et al.  Understanding and Simplifying One-Shot Architecture Search , 2018, ICML.

[4]  Youfu Li,et al.  Three-Stream Attention-Aware Network for RGB-D Salient Object Detection , 2019, IEEE Transactions on Image Processing.

[5]  Michael Ying Yang,et al.  Exploiting global priors for RGB-D saliency detection , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[6]  Ling Shao,et al.  BBS-Net: RGB-D Salient Object Detection with a Bifurcated Backbone Strategy Network , 2020, ECCV.

[7]  Lei Zhang,et al.  A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection , 2020, ECCV.

[8]  Yue Gao,et al.  3-D Object Retrieval and Recognition With Hypergraph Analysis , 2012, IEEE Transactions on Image Processing.

[9]  Theodore Lim,et al.  SMASH: One-Shot Model Architecture Search through HyperNetworks , 2017, ICLR.

[10]  Xueqing Li,et al.  Leveraging stereopsis for saliency analysis , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[12]  Wei Zhang,et al.  Salient object detection for RGB-D image by single stream recurrent convolution neural network , 2019, Neurocomputing.

[13]  Ran Ju,et al.  Depth saliency based on anisotropic center-surround difference , 2014, 2014 IEEE International Conference on Image Processing (ICIP).

[14]  Wenguan Wang,et al.  Shifting More Attention to Video Salient Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Frédéric Jurie,et al.  MFAS: Multimodal Fusion Architecture Search , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Nuno Vasconcelos,et al.  Saliency-based discriminant tracking , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Patrice Marcotte,et al.  An overview of bilevel optimization , 2007, Ann. Oper. Res..

[18]  Nick Barnes,et al.  Local Background Enclosure for RGB-D Salient Object Detection , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Harish Katti,et al.  Depth Matters: Influence of Depth Cues on Visual Saliency , 2012, ECCV.

[20]  Xing Cai,et al.  PDNet: Prior-Model Guided Depth-Enhanced Network for Salient Object Detection , 2018, 2019 IEEE International Conference on Multimedia and Expo (ICME).

[21]  Jie Liu,et al.  Asymmetric Two-Stream Architecture for Accurate RGB-D Saliency Detection , 2020, ECCV.

[22]  Wei Ji,et al.  Depth-Induced Multi-Scale Recurrent Attention Network for Saliency Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[23]  Shuhan Chen,et al.  Progressively Guided Alternate Refinement Network for RGB-D Salient Object Detection , 2020, ECCV.

[24]  Linwei Ye,et al.  Cross-Modal Weighting Network for RGB-D Salient Object Detection , 2020, ECCV.

[25]  James M. Rehg,et al.  An In Depth View of Saliency , 2013, BMVC.

[26]  Haibin Ling,et al.  Saliency Detection on Light Field , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[27]  Wei Ji,et al.  Accurate RGB-D Salient Object Detection via Collaborative Learning , 2020, ECCV.

[28]  Ramesh Raskar,et al.  Designing Neural Network Architectures using Reinforcement Learning , 2016, ICLR.

[29]  Zheng Lin,et al.  Rethinking RGB-D Salient Object Detection: Models, Data Sets, and Large-Scale Benchmarks , 2019, IEEE Transactions on Neural Networks and Learning Systems.

[30]  Li Fei-Fei,et al.  Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Nick Barnes,et al.  UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Junwei Han,et al.  Learning Selective Self-Mutual Attention for RGB-D Saliency Detection , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Ge Li,et al.  A Three-Pathway Psychobiological Framework of Salient Object Detection Using Stereoscopic Technology , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[34]  Huchuan Lu,et al.  Hierarchical Dynamic Filtering Network for RGB-D Salient Object Detection , 2020, ECCV.

[35]  Xiaochun Cao,et al.  Depth Enhanced Saliency Detection Method , 2014, ICIMCS '14.

[36]  Natalia Gimelshein,et al.  PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.

[37]  Seunghoon Hong,et al.  Online Tracking by Learning Discriminative Saliency Map with Convolutional Neural Network , 2015, ICML.

[38]  Yiming Yang,et al.  DARTS: Differentiable Architecture Search , 2018, ICLR.

[39]  Dan Su,et al.  Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for RGB-D salient object detection , 2019, Pattern Recognit..

[40]  Hang Xu,et al.  Auto-FPN: Automatic Network Architecture Adaptation for Object Detection Beyond Classification , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[41]  Terry L. Friesz,et al.  Hierarchical optimization: An introduction , 1992, Ann. Oper. Res..

[42]  Nick Barnes,et al.  Learning RGB-D Salient Object Detection Using Background Enclosure, Depth Contrast, and Top-Down Features , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[43]  Xiaogang Wang,et al.  Unsupervised Salience Learning for Person Re-identification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[44]  Jiandong Tian,et al.  RGBD Salient Object Detection via Deep Fusion , 2016, IEEE Transactions on Image Processing.

[45]  Jianping Shi,et al.  Graph-Guided Architecture Search for Real-Time Semantic Segmentation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[46]  Quoc V. Le,et al.  Neural Architecture Search with Reinforcement Learning , 2016, ICLR.

[47]  Alok Aggarwal,et al.  Regularized Evolution for Image Classifier Architecture Search , 2018, AAAI.

[48]  Quoc V. Le,et al.  NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[49]  Ali Borji,et al.  Salient Object Detection: A Benchmark , 2015, IEEE Transactions on Image Processing.

[50]  Yongri Piao,et al.  A2dele: Adaptive and Attentive Depth Distiller for Efficient RGB-D Salient Object Detection , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Yang Cao,et al.  Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[52]  Dacheng Tao,et al.  Deep Multimodal Neural Architecture Search , 2020, ACM Multimedia.

[53]  Yongri Piao,et al.  Select, Supplement and Focus for RGB-D Saliency Detection , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[54]  Bolei Zhou,et al.  Learning Deep Features for Discriminative Localization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[55]  Siwei Lyu,et al.  Cascade Graph Neural Networks for RGB-D Salient Object Detection , 2020, ECCV.

[56]  Runmin Cong,et al.  RGB-D Salient Object Detection with Cross-Modality Modulation and Selection , 2020, ECCV.

[57]  Sabine Süsstrunk,et al.  Frequency-tuned salient region detection , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[58]  Tao Li,et al.  Structure-Measure: A New Way to Evaluate Foreground Maps , 2017, International Journal of Computer Vision.

[59]  Bo Ren,et al.  Enhanced-alignment Measure for Binary Foreground Map Evaluation , 2018, IJCAI.

[60]  Rongrong Ji,et al.  RGBD Salient Object Detection: A Benchmark and Algorithms , 2014, ECCV.

[61]  Guanghai Liu,et al.  A Model of Visual Attention for Natural Image Retrieval , 2013, 2013 International Conference on Information Science and Cloud Computing Companion.

[62]  Fatih Murat Porikli,et al.  Saliency-aware geodesic video object segmentation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).