Paper information - SwiDeN: Convolutional Neural Networks For Depiction Invariant Object Recognition

Current state-of-the-art object recognition architectures achieve impressive performance but are typically specialized for a single depictive style (e.g. photos only, sketches only). In this paper, we present SwiDeN, our Convolutional Neural Network (CNN) architecture that recognizes objects regardless of how they are visually depicted (line drawing, realistic shaded drawing, photograph, etc.). In SwiDeN, we utilize a novel 'deep' depictive style-based switching mechanism that addresses both the depiction-specific and the depiction-invariant aspects of the problem. We compare SwiDeN with alternative architectures and prior work on a 50-category Photo-Art dataset containing objects depicted in multiple styles. Experimental results show that SwiDeN outperforms other approaches on the depiction-invariant object recognition problem.
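The general idea of style-based switching can be illustrated with a toy sketch. This is not the SwiDeN architecture: the abstract does not specify layer types, sizes, or where the switch sits, so everything below is an assumption made for illustration. The class name `SwitchedClassifier`, the helper functions, the random (untrained) weights, the choice to put style-specific branches before a shared layer, and the use of a style label supplied by the caller are all hypothetical. Only the count of 50 categories comes from the abstract.

```python
import random


def make_linear(n_in, n_out, rng):
    """Random weight matrix (list of rows); a stand-in for a trained layer."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def apply_linear(weights, x):
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in x]


class SwitchedClassifier:
    """Toy model: one branch per depictive style, then a shared layer.

    Assumption: the branches model depiction-specific processing and the
    shared layer models depiction-invariant processing. The real network
    may arrange these parts differently.
    """

    def __init__(self, styles, n_in, n_hidden, n_classes, seed=0):
        rng = random.Random(seed)
        # Depiction-specific part: separate weights for each style.
        self.branches = {s: make_linear(n_in, n_hidden, rng) for s in styles}
        # Depiction-invariant part: weights shared by all styles.
        self.shared = make_linear(n_hidden, n_classes, rng)

    def predict(self, x, style):
        """Route x through the branch for `style`, then the shared layer."""
        if style not in self.branches:
            raise KeyError(f"unknown depictive style: {style!r}")
        hidden = relu(apply_linear(self.branches[style], x))
        scores = apply_linear(self.shared, hidden)
        # Return the index of the highest-scoring category.
        return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    model = SwitchedClassifier(
        styles=["photo", "line_drawing"], n_in=4, n_hidden=8, n_classes=50
    )
    features = [0.1, 0.2, 0.3, 0.4]
    print(model.predict(features, "photo"))
    print(model.predict(features, "line_drawing"))
```

The switch here is a plain dictionary lookup keyed on a style label passed in by the caller. In a real system the style would more plausibly be inferred from the image, and the branches and shared layers would be trained convolutional stacks rather than random linear maps.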

Ravi Kiran Sarvadevabhatla | Shiv Surya | Srinivas S. Kruthiventi | R. Venkatesh Babu
