Personal Fixations-Based Object Segmentation With Object Localization and Boundary Preservation

As a natural means of human-computer interaction, fixation offers a promising approach to interactive image segmentation. In this paper, we focus on Personal Fixations-based Object Segmentation (PFOS) to address issues in previous studies, such as the lack of an appropriate dataset and the ambiguity in fixations-based interaction. In particular, we first construct a new PFOS dataset by carefully collecting pixel-level binary annotation data over an existing fixation prediction dataset; this dataset is expected to greatly facilitate research in this direction. Then, considering the characteristics of personal fixations, we propose a novel network based on Object Localization and Boundary Preservation (OLBP) to segment the gazed objects. Specifically, the OLBP network uses an Object Localization Module (OLM) to analyze personal fixations and locate the gazed objects based on that interpretation. A Boundary Preservation Module (BPM) then introduces additional boundary information to preserve the completeness of the gazed objects. Moreover, OLBP is organized in a mixed bottom-up and top-down manner with multiple types of deep supervision. Extensive experiments on the constructed PFOS dataset show the superiority of the proposed OLBP network over 17 state-of-the-art methods and demonstrate the effectiveness of the proposed OLM and BPM components. The constructed PFOS dataset and the proposed OLBP network are available at https://github.com/MathLee/OLBPNet4PFOS.
