Paper information - Multi-Shot Mining Semantic Part Concepts in CNNs

This paper proposes a new learning strategy that incrementally embeds new object-part concepts into a pre-trained convolutional neural network (CNN), in order to 1) explore explicit semantics for the CNN units and 2) gradually transform the pre-trained CNN into a “white-box” model for hierarchical object understanding. Given part annotations on a very small number (e.g., 3–12) of objects, our method mines certain units from the pre-trained CNN and associates them with different part concepts. We use a four-layer And-Or graph to organize the CNN units, which clarifies their internal semantic hierarchy. Guided by this small number of part annotations, our method achieves superior part-localization performance (about 28%–107% improvement in part center prediction).
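To make the idea of organizing CNN units in a four-layer And-Or graph more concrete, here is a minimal sketch. It is not the paper's implementation. The layer roles (a part as an OR node, part templates as AND nodes, latent patterns as OR nodes, CNN units as terminals), the class names `Terminal`, `OrNode`, and `AndNode`, and all scores are assumptions made for illustration. The scoring rule (an AND node sums its children, an OR node picks its best child) is a common And-Or graph convention and may differ from the inference used in the paper.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Terminal:
    """Leaf node standing in for one CNN unit (the score is made up)."""
    name: str
    score: float

    def infer(self) -> Tuple[float, List[str]]:
        return self.score, [self.name]


@dataclass
class OrNode:
    """Selects the single best-scoring child (one alternative)."""
    name: str
    children: list = field(default_factory=list)

    def infer(self) -> Tuple[float, List[str]]:
        best_score, best_parse = max(
            (child.infer() for child in self.children), key=lambda r: r[0]
        )
        return best_score, [self.name] + best_parse


@dataclass
class AndNode:
    """Composes all children by summing their scores."""
    name: str
    children: list = field(default_factory=list)

    def infer(self) -> Tuple[float, List[str]]:
        total, parse = 0.0, [self.name]
        for child in self.children:
            score, sub_parse = child.infer()
            total += score
            parse += sub_parse
        return total, parse


# Hypothetical toy graph: part (OR) -> templates (AND)
# -> latent patterns (OR) -> CNN units (terminals).
head = OrNode("head", [
    AndNode("t1", [
        OrNode("p1", [Terminal("u1", 0.25), Terminal("u2", 0.75)]),
        OrNode("p2", [Terminal("u3", 0.5)]),
    ]),
    AndNode("t2", [
        OrNode("p3", [Terminal("u4", 0.5), Terminal("u5", 0.125)]),
        OrNode("p4", [Terminal("u6", 0.25)]),
    ]),
])

score, parse = head.infer()
print(score, parse)
```

In this toy graph the returned parse lists the selected template, the latent patterns under it, and the unit chosen for each pattern, which is the sense in which such a hierarchy makes the role of individual units explicit.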
