Learning Semantic Concepts from Weakly Labeled Data

As in many areas where deep learning methods are applied, object and scene recognition have seen great success, but good results require a large amount of labeled data. Our aim in this study is to eliminate the need for collecting manually labeled data, which demands a great deal of human labor, and to use instead the abundant images on the Internet and social media that are paired with object names. However, this data is not as clean as manually labeled data, and it does not work well when used directly. In this study, the Association with Model Evolution (AME) method is adapted to eliminate noisy data. Data that was automatically collected and then cleaned with AME was used as the experimental set for Convolutional Neural Networks (CNNs). Using the AME-cleaned data increased performance by 4%.
