Paper information - Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion

RGB-D salient object detection (SOD) is usually formulated as a classification or regression problem over two modalities, i.e., RGB and depth. Hence, effective RGB-D feature modeling and multi-modal feature fusion both play a vital role in RGB-D SOD. In this paper, we propose a depth-sensitive RGB feature modeling scheme that uses the depth-wise geometric prior of salient objects. The scheme is carried out in a depth-sensitive attention module, which enhances RGB features and reduces background distraction by capturing the depth geometry prior. Moreover, to perform effective multi-modal feature fusion, we further present an automatic architecture search approach for RGB-D SOD, which finds a feasible architecture in our specially designed multi-modal multi-scale search space. Extensive experiments on seven standard benchmarks demonstrate the effectiveness of the proposed approach against the state of the art.
