SPFusionNet: Sketch Segmentation Using Multi-modal Data Fusion

The sketch segmentation problem remains largely unsolved because conventional methods are greatly challenged by the highly abstract appearances of freehand sketches and their numerous shape variations. In this work, we tackle such challenges by exploiting different modes of sketch data in a unified framework. Specifically, we propose a deep neural network SPFusionNet to capture the characteristic of sketch by fusing from its image and point set modes. The image modal component SketchNet learns hierarchically abstract ro-bust features and utilizes multi-level representations to produce pixel-wise feature maps, while the point set-modal component SPointNet captures local and global contexts of the sampled point set to produce point-wise feature maps. Then our framework aggregates these feature maps by a fusion network component to generate the sketch segmentation result. The extensive experimental evaluation and comparison with peer methods on our large SketchSeg dataset verify the effectiveness of the proposed framework.

[1]  Tinne Tuytelaars,et al.  Example-Based Sketch Segmentation and Labeling Using CRFs , 2016, ACM Trans. Graph..

[2]  Camille Couprie,et al.  Learning Hierarchical Features for Scene Labeling , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Eugenio Culurciello,et al.  LinkNet: Exploiting encoder representations for efficient semantic segmentation , 2017, 2017 IEEE Visual Communications and Image Processing (VCIP).

[4]  Jitendra Malik,et al.  Efficient shape matching using shape contexts , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Leonidas J. Guibas,et al.  SyncSpecCNN: Synchronized Spectral CNN for 3D Shape Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Fei Wang,et al.  A Data‐Driven Approach for Sketch‐Based 3D Shape Retrieval via Similar Drawing‐Style Recommendation , 2017, Comput. Graph. Forum.

[8]  S. Dwivedi,et al.  Obesity May Be Bad: Compressed Convolutional Networks for Biomedical Image Segmentation , 2020 .

[9]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[10]  Leonidas J. Guibas,et al.  Volumetric and Multi-view CNNs for Object Classification on 3D Data , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Rynson W. H. Lau,et al.  Data-driven segmentation and labeling of freehand sketches , 2014, ACM Trans. Graph..

[12]  Shi-Min Hu,et al.  Sketch2Scene: sketch-based co-retrieval and co-placement of 3D models , 2013, ACM Trans. Graph..

[13]  Xiangjian He,et al.  Multi-view pairwise relationship learning for sketch based 3D shape retrieval , 2017, 2017 IEEE International Conference on Multimedia and Expo (ICME).

[14]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  David Gur,et al.  Automated freehand sketch segmentation using radial basis functions , 2009, Comput. Aided Des..

[16]  Shaogang Gong,et al.  Free-Hand Sketch Synthesis with Deformable Stroke Models , 2016, International Journal of Computer Vision.

[17]  Lei Li Fast Sketch Segmentation and Labeling with Deep Learning , 2018 .

[18]  Hanhui Li,et al.  Multi-column point-CNN for sketch segmentation , 2018, Neurocomputing.

[19]  Andrew Zisserman,et al.  Spatial Transformer Networks , 2015, NIPS.

[20]  Tao Xiang,et al.  Sketch-a-Net: A Deep Neural Network that Beats Humans , 2017, International Journal of Computer Vision.

[21]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Sebastian Scherer,et al.  VoxNet: A 3D Convolutional Neural Network for real-time object recognition , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[23]  Alexey Shvets,et al.  TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation , 2018, Computer-Aided Analysis of Gastrointestinal Videos.