Visual Content Recognition by Exploiting Semantic Feature Map with Attention and Multi-task Learning
暂无分享,去创建一个
Qi Zhang | Yu-Gang Jiang | Jianguo Li | Zuxuan Wu | Rui-Wei Zhao | Qi Zhang | Yu-Gang Jiang | Jianguo Li | Zuxuan Wu | Rui Zhao
[1] Wei Xu,et al. CNN-RNN: A Unified Framework for Multi-label Image Classification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Gang Wang,et al. Learning Contextual Dependencies with Convolutional Hierarchical Recurrent Neural Networks , 2015, ArXiv.
[3] Andrew Zisserman,et al. Spatial Transformer Networks , 2015, NIPS.
[4] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[5] Bingbing Ni,et al. HCP: A Flexible CNN Framework for Multi-Label Image Classification , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[6] Dong Liu,et al. Discovering joint audio–visual codewords for video event detection , 2013, Machine Vision and Applications.
[7] Shih-Fu Chang,et al. Consumer video understanding: a benchmark database and an evaluation of human and machine performance , 2011, ICMR.
[8] Xi Wang,et al. Multi-Stream Multi-Class Fusion of Deep Networks for Video Classification , 2016, ACM Multimedia.
[9] C. Lawrence Zitnick,et al. Edge Boxes: Locating Object Proposals from Edges , 2014, ECCV.
[10] Yu-Gang Jiang,et al. Harnessing Object and Scene Semantics for Large-Scale Video Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Chong-Wah Ngo,et al. Towards optimal bag-of-features for object categorization and semantic video retrieval , 2007, CIVR '07.
[12] Chong-Wah Ngo,et al. Opinion Question Answering by Sentiment Clip Localization , 2015, ACM Trans. Multim. Comput. Commun. Appl..
[13] G M SnoekCees,et al. No spare parts , 2016 .
[14] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Nitish Srivastava,et al. Exploiting Image-trained CNN Architectures for Unconstrained Video Classification , 2015, BMVC.
[16] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[17] Pong C. Yuen,et al. Reduced Analytic Dependency Modeling: Robust Fusion for Visual Recognition , 2014, International Journal of Computer Vision.
[18] Andrew Zisserman,et al. Return of the Devil in the Details: Delving Deep into Convolutional Nets , 2014, BMVC.
[19] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[20] Ming-Syan Chen,et al. Video Event Detection by Inferring Temporal Instance Labels , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[21] Chong-Wah Ngo,et al. Human Action Recognition in Unconstrained Videos by Explicit Motion Modeling , 2015, IEEE Transactions on Image Processing.
[22] Geoffrey E. Hinton,et al. Deep Learning , 2015, Nature.
[23] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[24] Jian Dong,et al. Subcategory-Aware Object Classification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[25] Xiangyang Xue,et al. Regional Gating Neural Networks for Multi-label Image Classification , 2016, BMVC.
[26] Cees Snoek,et al. UvA-DARE ( Digital Academic Repository ) Event Fisher Vectors : Robust Encoding Visual Diversity of Visual Streams , 2015 .
[27] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[28] Hao Su,et al. Object Bank: An Object-Level Image Representation for High-Level Visual Recognition , 2014, International Journal of Computer Vision.
[29] Ivan Laptev,et al. Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[30] Xi Wang,et al. Modeling Spatial-Temporal Clues in a Hybrid Deep Learning Framework for Video Classification , 2015, ACM Multimedia.
[31] Jian Sun,et al. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[32] Ross B. Girshick,et al. Fast R-CNN , 2015, 1504.08083.
[33] Jianfei Cai,et al. Can Partial Strong Labels Boost Multi-label Object Recognition? , 2015, ArXiv.
[34] Cordelia Schmid,et al. Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[35] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[36] Tao Mei,et al. Learning Multi-attention Convolutional Neural Network for Fine-Grained Image Recognition , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[37] Yuxin Peng,et al. The application of two-level attention models in deep convolutional neural network for fine-grained image classification , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Changsheng Xu,et al. Semantic Feature Mining for Video Event Understanding , 2016, ACM Trans. Multim. Comput. Commun. Appl..
[39] Xi Wang,et al. Exploiting Objects with LSTMs for Video Categorization , 2016, ACM Multimedia.
[40] Koen E. A. van de Sande,et al. Selective Search for Object Recognition , 2013, International Journal of Computer Vision.
[41] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[43] Nicu Sebe,et al. Feature Weighting via Optimal Thresholding for Video Analysis , 2013, 2013 IEEE International Conference on Computer Vision.
[44] Luc Van Gool,et al. The Pascal Visual Object Classes Challenge: A Retrospective , 2014, International Journal of Computer Vision.
[45] Meng Wang,et al. Beyond Object Proposals: Random Crop Pooling for Multi-Label Image Recognition , 2016, IEEE Transactions on Image Processing.
[46] Dong Liu,et al. Sample-Specific Late Fusion for Visual Category Recognition , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[47] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[48] Dong Liu,et al. Robust late fusion with rank minimization , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[49] Shih-Fu Chang,et al. Exploiting Feature and Class Relationships in Video Categorization with Regularized Deep Neural Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[50] Jun Wang,et al. Deep Attributes from Context-Aware Regional Neural Codes , 2015, ArXiv.
[51] Ali Farhadi,et al. You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[52] Bo Zhao,et al. Diversified Visual Attention Networks for Fine-Grained Object Classification , 2016, IEEE Transactions on Multimedia.
[53] Yizhou Yu,et al. Harvesting Discriminative Meta Objects with Deep CNN Features for Scene Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[54] Wei Liu,et al. SSD: Single Shot MultiBox Detector , 2015, ECCV.
[55] Svetlana Lazebnik,et al. Multi-scale Orderless Pooling of Deep Convolutional Activation Features , 2014, ECCV.
[56] Yu-Gang Jiang,et al. Learning Semantic Feature Map for Visual Content Recognition , 2017, ACM Multimedia.
[57] Cees Snoek,et al. No spare parts: Sharing part detectors for image categorization , 2015, Comput. Vis. Image Underst..
[58] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[59] Ruslan Salakhutdinov,et al. Action Recognition using Visual Attention , 2015, NIPS 2015.
[60] Gang Wang,et al. Learning Contextual Dependence With Convolutional Hierarchical Recurrent Neural Networks , 2015, IEEE Transactions on Image Processing.
[61] Philip H. S. Torr,et al. BING: Binarized normed gradients for objectness estimation at 300fps , 2014, Computational Visual Media.
[62] Kavita Bala,et al. Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[63] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.