Attribute Driven Zero-Shot Classification and Segmentation

Zero-shot classification and segmentation aim to recognize and segment objects of unseen classes. Attribute information, such as color, shape, part, and material, is commonly used for zero-shot classification. We observe that the same attribute information can also help in the segmentation task. On this basis, we propose an Attribute-Segmentation-Attribute (ASA) framework to address the zero-shot classification and segmentation problem. In the framework, a multi-task model is first pre-trained to capture category and attribute features simultaneously. Then, a two-branch fully convolutional structure is built on the pre-trained model and fine-tuned for the segmentation task. Finally, the extracted object of an unseen class is recognized using the segmentation-assisted attribute prediction and a class-attribute matrix. Experimental results on public benchmark datasets indicate that the proposed ASA framework outperforms state-of-the-art methods on both the classification and segmentation tasks.
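The final recognition step, which matches predicted attributes against a class-attribute matrix, can be illustrated with a minimal sketch. The abstract does not state the matching rule, so cosine similarity is used here purely as an assumption; the attribute names, class names, matrix values, and the helper `classify_unseen` are all hypothetical and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def classify_unseen(attr_scores, class_attr_matrix):
    """Return the class whose attribute signature best matches the predicted scores.

    Assumption: matching is done by cosine similarity; the paper may use a
    different rule.
    """
    return max(class_attr_matrix,
               key=lambda c: cosine(attr_scores, class_attr_matrix[c]))

# Hypothetical class-attribute matrix for unseen classes.
# Columns (illustrative only): [striped, furry, has_wheels, metallic]
CLASS_ATTR = {
    "zebra": [1, 1, 0, 0],
    "bus":   [0, 0, 1, 1],
    "cat":   [0, 1, 0, 0],
}

# Hypothetical attribute probabilities predicted for one segmented object.
predicted = [0.9, 0.7, 0.1, 0.0]
label = classify_unseen(predicted, CLASS_ATTR)
print(label)
```

Because the class-attribute matrix is defined for classes that have no training images, the classifier never needs to see an example of the unseen class; it only needs attribute predictors trained on seen classes.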
