Rethinking Click Embedding for Deep Interactive Image Segmentation

In interactive image segmentation, users guide the segmentation process through interactions such as scribbles or bounding boxes. Deep interactive segmentation likewise uses these interactions to steer the network toward the target of interest. This article focuses on mouse clicks, the simplest interaction mode. Two issues are central to a deep interactive segmentation framework: how to characterize the click interaction effectively (we call this “click encoding”) and how to fuse the click-related information with the network. Current click encoding methods consider only the spatial information of the clicks, so the region affected by each click is difficult to control, which reduces the stability of the network. We therefore propose a feature-interactive map that builds a close relationship between interaction information and target semantics, so that its affected region is determined by semantic information. Furthermore, we introduce an interactive nonlocal block by embedding the feature-interactive map into a nonlocal block, so that long-range dependencies of the interaction information can be captured. Finally, following an early fusion strategy, the features of the interactive nonlocal block are fused with the high-level features, amplifying the impact of position and semantics on the final prediction. Comprehensive experiments demonstrate that our click embedding approach significantly boosts the efficiency of the network and achieves state-of-the-art performance.
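To make the contrast between purely spatial click encoding and a semantics-driven map concrete, the following is a minimal sketch and not the formulation used in this article. It assumes a Gaussian map for the spatial encoding and cosine similarity between the clicked pixel's feature vector and every other pixel's feature vector for the semantic map. The function names `spatial_click_map` and `feature_click_map`, the `sigma` parameter, and the toy feature grid are all hypothetical.

```python
import math

def spatial_click_map(height, width, click, sigma=2.0):
    """Gaussian map centred on the click; depends on position only (assumed encoding)."""
    cy, cx = click
    return [[math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
             for x in range(width)] for y in range(height)]

def feature_click_map(features, click):
    """Cosine similarity between the clicked pixel's feature and every pixel's feature.

    This is an illustrative stand-in for a semantics-aware map, not the paper's definition.
    """
    cy, cx = click
    ref = features[cy][cx]
    ref_norm = math.sqrt(sum(v * v for v in ref)) or 1.0
    out = []
    for row in features:
        out_row = []
        for vec in row:
            norm = math.sqrt(sum(v * v for v in vec)) or 1.0
            dot = sum(a * b for a, b in zip(ref, vec))
            out_row.append(dot / (ref_norm * norm))
        out.append(out_row)
    return out

# Toy 4x4 feature grid: the left two columns carry "object" features,
# the right two columns carry "background" features.
obj, bg = [1.0, 0.0], [0.0, 1.0]
features = [[obj, obj, bg, bg] for _ in range(4)]
click = (1, 0)  # (row, column) of a positive click on the object

spatial = spatial_click_map(4, 4, click)
semantic = feature_click_map(features, click)
```

In this toy setup the spatial map decays with distance from the click regardless of content, whereas the similarity map responds uniformly across the object columns and not at all on the background columns, which is the kind of semantically bounded influence the abstract describes.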
