SCG: Saliency and Contour Guided Salient Instance Segmentation

Unlike conventional instance segmentation, salient instance segmentation (SIS) faces two difficulties. First, it must segment only salient instances while ignoring the background. Second, it targets generic object instances without predefined object categories. In this paper, we build on the state-of-the-art Mask R-CNN model and leverage complementary saliency and contour information to address these two challenges. We first improve Mask R-CNN by introducing an interleaved execution strategy and proposing a novel mask head network that incorporates global context within each RoI. We then add two branches to Mask R-CNN for saliency detection and contour detection, respectively. We fuse the Mask R-CNN features with the saliency and contour features: the saliency features supply pixel-wise saliency information that helps identify salient regions, and the contour features provide a generic object contour prior that helps detect and segment generic objects. We also propose a novel multiscale global attention model that generates attentive global features from multiscale representative features for feature fusion. Experimental results demonstrate that each proposed model component improves SIS performance. Our overall model outperforms state-of-the-art SIS methods and Mask R-CNN by more than 6% and 3%, respectively. With additional multitask training data, we further improve model performance on the ILSO dataset.
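
To make the idea of attention-weighted global features concrete, the following is a minimal sketch in plain Python, not the architecture proposed in this paper. It assumes that each scale is summarized by global average pooling, that scales are scored by a dot product with a hypothetical `query` vector (which would be learned in a real model), and that fusion with an RoI feature is a simple broadcast addition. All function names are illustrative.

```python
import math


def global_average_pool(feature_map):
    """Collapse a C x H x W feature map (nested lists) into a length-C vector."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_global_feature(multiscale_maps, query):
    """Attention-weighted sum of per-scale global descriptors.

    `query` is a hypothetical scoring vector (an assumption of this sketch).
    Returns the fused length-C vector and the per-scale attention weights.
    """
    descriptors = [global_average_pool(fm) for fm in multiscale_maps]
    scores = [sum(q * d for q, d in zip(query, desc)) for desc in descriptors]
    weights = softmax(scores)
    fused = [
        sum(w * desc[c] for w, desc in zip(weights, descriptors))
        for c in range(len(query))
    ]
    return fused, weights


def fuse_with_roi(roi_feature, global_vector):
    """Add the global vector to every spatial position of a C x H x W RoI feature."""
    return [
        [[value + g for value in row] for row in channel]
        for channel, g in zip(roi_feature, global_vector)
    ]


# Two scales with two channels each; spatial sizes may differ across scales.
maps = [
    [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]],
    [[[0.0]], [[1.0]]],
]
global_vec, scale_weights = attentive_global_feature(maps, query=[0.0, 0.0])
roi = [[[1.0, 2.0]], [[3.0, 4.0]]]
fused_roi = fuse_with_roi(roi, global_vec)
```

With a zero `query`, every scale receives the same attention weight, so the global vector is the plain average of the per-scale descriptors. A learned query would shift weight toward the more informative scales.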