Related Papers

Instance-Aware Semantic Segmentation via Multi-task Network Cascades

Abstract: Semantic segmentation research has recently witnessed rapid progress, but many leading methods are unable to identify object instances. In this paper, we present Multi-task Network Cascades for instance-aware semantic segmentation. Our model consists of three networks, respectively differentiating instances, estimating masks, and categorizing objects. These networks form a cascaded structure, and are designed to share their convolutional features. We develop an algorithm for the nontrivial end-to-end training of this causal, cascaded structure. Our solution is a clean, single-step training framework and can be generalized to cascades that have more stages. We demonstrate state-of-the-art instance-aware semantic segmentation accuracy on PASCAL VOC. Meanwhile, our method takes only 360 ms testing an image using VGG-16, which is two orders of magnitude faster than previous systems for this challenging problem. As a by-product, our method also achieves compelling object detection results which surpass the competitive Fast/Faster R-CNN systems. The method described in this paper is the foundation of our submissions to the MS COCO 2015 segmentation competition, where we won the 1st place.

References

[1] Lawrence D. Jackel, et al. Backpropagation Applied to Handwritten Zip Code Recognition, 1989, Neural Computation.

[2] Rich Caruana, et al. Multitask Learning, 1997, Machine-mediated learning.

[3] Subhransu Maji, et al. Semantic contours from inverse detectors, 2011, 2011 International Conference on Computer Vision.

[4] Cristian Sminchisescu, et al. Semantic Segmentation with Second-Order Pooling, 2012, ECCV.

[5] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.

[6] Cristian Sminchisescu, et al. CPMC: Automatic Object Segmentation Using Constrained Parametric Min-Cuts, 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7] Koen E. A. van de Sande, et al. Selective Search for Object Recognition, 2013, International Journal of Computer Vision.

[8] Yoshua Bengio, et al. Maxout Networks, 2013, ICML.

[9] Jürgen Schmidhuber, et al. Compete to Compute, 2013, NIPS.

[10] Rob Fergus, et al. Visualizing and Understanding Convolutional Neural Networks, 2013.

[11] Trevor Darrell, et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[12] Jitendra Malik, et al. Simultaneous Detection and Segmentation, 2014, ECCV.

[13] Trevor Darrell, et al. Caffe: Convolutional Architecture for Fast Feature Embedding, 2014, ACM Multimedia.

[14] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.

[15] Jonathan T. Barron, et al. Multiscale Combinatorial Grouping, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[16] C. Lawrence Zitnick, et al. Edge Boxes: Locating Object Proposals from Edges, 2014, ECCV.

[17] Jian Sun, et al. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition, 2015, IEEE Trans. Pattern Anal. Mach. Intell.

[18] Iasonas Kokkinos, et al. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs, 2014, ICLR.

[19] Jian Sun, et al. Convolutional feature masking for joint object and stuff segmentation, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21] Jitendra Malik, et al. Hypercolumns for object segmentation and fine-grained localization, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22] Ronan Collobert, et al. Learning to Segment Object Candidates, 2015, NIPS.

[23] Ross B. Girshick, et al. Fast R-CNN, 2015, arXiv:1504.08083.

[24] Vibhav Vineet, et al. Conditional Random Fields as Recurrent Neural Networks, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[25] Jian Sun, et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[26] Nikos Komodakis, et al. Object Detection via a Multi-region and Semantic Segmentation-Aware CNN Model, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[27] George Papandreou, et al. Weakly- and Semi-Supervised Learning of a DCNN for Semantic Image Segmentation, 2015, ArXiv.

[28] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.

[29] Jian Sun, et al. BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[30] Andrew Zisserman, et al. Spatial Transformer Networks, 2015, NIPS.

[31] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32] Trevor Darrell, et al. Fully Convolutional Networks for Semantic Segmentation, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

Cited By

Real-time Factored ConvNets: Extracting the X Factor in Human Parsing. BMVC, 2017.

Controlling the Transport Defects of Power Generating Solar Panels. 2019 IEEE 39th International Conference on Electronics and Nanotechnology (ELNANO), 2019.

Object-level image segmentation with prior information. 2019.

Instance Segmentation Based on Superpixel Module and Attention Module. 2020 IEEE 5th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), 2020.

Aerial Imagery for Roof Segmentation: A Large-Scale Dataset towards Automatic Mapping of Buildings. ArXiv, 2018.

Attribute Driven Zero-Shot Classification and Segmentation. 2018 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), 2018.

Exploring Flood Filling Networks for Instance Segmentation of XXL-Volumetric and Bulk Material CT Data. Journal of Nondestructive Evaluation, 2020.

Intelligent monitoring of indoor surveillance video based on deep learning. 2019 21st International Conference on Advanced Communication Technology (ICACT), 2019.

A review of object detection based on deep learning. Multimedia Tools and Applications, 2020.

Deep Cross-Domain Fashion Recommendation. RecSys, 2017.

Research on the Application of Instance Segmentation Algorithm in the Counting of Metro Waiting Population. ICGEC, 2019.

Attention Receptive Pyramid Network for Ship Detection in SAR Images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2020.

Universal representations: The missing link between faces, text, planktons, and cat breeds. ArXiv, 2017.

360-Indoor: Towards Learning Real-World Objects in 360° Indoor Equirectangular Images. 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), 2019.

Learning Region Features for Object Detection. ECCV, 2018.

MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving. 2018 IEEE Intelligent Vehicles Symposium (IV), 2016.

Multi-task human analysis in still images: 2D/3D pose, depth map, and multi-part segmentation. 2019 14th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2019), 2019.

Fine-Grained Recognition in the Wild: A Multi-task Domain Adaptation Approach. 2017 IEEE International Conference on Computer Vision (ICCV), 2017.

End-to-End Instance Segmentation and Counting with Recurrent Attention. ArXiv, 2016.

End-to-End Instance Segmentation with Recurrent Attention. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.