Paper information - Growing Interpretable Part Graphs on ConvNets via Multi-Shot Learning

This paper proposes a learning strategy that embeds object-part concepts into a pre-trained convolutional neural network (CNN), with two goals: 1) to explore the explicit semantics hidden in CNN units, and 2) to gradually transform the pre-trained CNN into a semantically interpretable graphical model for hierarchical object understanding. Given part annotations on very few (e.g., 3-12) objects, our method mines certain latent patterns from the pre-trained CNN and associates them with different semantic parts. We use a four-layer And-Or graph to organize the CNN units, so as to clarify their internal semantic hierarchy. Guided by this small number of part annotations, our method achieves superior part-localization performance (about 13%-107% improvement in part center prediction on the PASCAL VOC and ImageNet datasets).
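To make the idea of a four-layer And-Or graph over CNN units more concrete, the following is a minimal sketch of one possible data structure. It is not the authors' implementation. The abstract only states that the graph has four layers and organizes CNN units, so the layer roles used here (semantic part as an OR node, part templates as AND nodes, latent patterns as OR nodes, CNN units as terminals), all class names, and the toy activation store are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical activation store: channel index -> {(row, col): activation}.
Activations = Dict[int, Dict[Tuple[int, int], float]]


@dataclass
class CNNUnit:
    """Layer 4 (terminal, assumed): one unit of a conv feature map."""
    channel: int
    position: Tuple[int, int]

    def score(self, acts: Activations) -> float:
        # Missing units are treated as inactive.
        return acts.get(self.channel, {}).get(self.position, 0.0)


@dataclass
class LatentPattern:
    """Layer 3 (OR, assumed): selects the best-responding candidate unit."""
    candidates: List[CNNUnit]

    def score(self, acts: Activations) -> float:
        return max((u.score(acts) for u in self.candidates), default=0.0)


@dataclass
class PartTemplate:
    """Layer 2 (AND, assumed): sums the scores of all its latent patterns."""
    patterns: List[LatentPattern]

    def score(self, acts: Activations) -> float:
        return sum(p.score(acts) for p in self.patterns)


@dataclass
class SemanticPart:
    """Layer 1 (OR, assumed): selects the best-scoring part template."""
    name: str
    templates: List[PartTemplate]

    def score(self, acts: Activations) -> float:
        return max((t.score(acts) for t in self.templates), default=0.0)


# Toy activations for two channels (made-up numbers, not real CNN output).
acts: Activations = {0: {(1, 1): 0.75, (1, 2): 0.5}, 1: {(2, 2): 0.25}}

pattern_a = LatentPattern([CNNUnit(0, (1, 1)), CNNUnit(0, (1, 2))])
pattern_b = LatentPattern([CNNUnit(1, (2, 2))])
template_1 = PartTemplate([pattern_a, pattern_b])
template_2 = PartTemplate([pattern_b])
head = SemanticPart("head", [template_1, template_2])

print(head.score(acts))
```

In this sketch, OR nodes take a maximum over alternatives and AND nodes sum over components, which is a common convention for And-Or graphs. The paper's actual scoring terms and how latent patterns are mined from few annotations are not specified in the abstract and are not modeled here.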
