Adaptive Pooling in Multi-instance Learning for Web Video Annotation
Dong Liu | Xiaoyan Sun | Wenjun Zeng | Zheng-Jun Zha | Yizhou Zhou
[1] James D. Keeler,et al. Integrated Segmentation and Recognition of Hand-Printed Numerals , 1990, NIPS.
[2] Thomas G. Dietterich,et al. Solving the Multiple Instance Problem with Axis-Parallel Rectangles , 1997, Artif. Intell..
[3] Jan Ramon,et al. Multi instance neural networks , 2000, ICML 2000.
[4] Yixin Chen,et al. Image Categorization by Learning and Reasoning with Regions , 2004, J. Mach. Learn. Res..
[5] Paul A. Viola,et al. Multiple Instance Boosting for Object Detection , 2005, NIPS.
[6] Cees G. M. Snoek,et al. Early versus late fusion in semantic video analysis , 2005, MULTIMEDIA '05.
[7] Rong Yan,et al. Multiple instance learning for labeling faces in broadcasting news video , 2005, MULTIMEDIA '05.
[8] Tao Mei,et al. Correlative multi-label video annotation , 2007, ACM Multimedia.
[9] Cordelia Schmid,et al. Learning realistic human actions from movies , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[10] Meng Wang,et al. Beyond Distance Measurement: Constructing Neighborhood Similarity for Video Annotation , 2009, IEEE Transactions on Multimedia.
[11] Jean Ponce,et al. A Theoretical Analysis of Feature Pooling in Visual Recognition , 2010, ICML.
[12] Yongdong Zhang,et al. Web video retagging , 2011, Multimedia Tools and Applications.
[13] Yongdong Zhang,et al. Web video categorization based on Wikipedia categories and content-duplicated open resources , 2010, ACM Multimedia.
[14] Alexander Zien,et al. lp-Norm Multiple Kernel Learning , 2011, J. Mach. Learn. Res..
[15] Zi Huang,et al. Multiple feature hashing for real-time large scale near-duplicate video retrieval , 2011, ACM Multimedia.
[16] Hsin-Min Wang,et al. Automatic annotation of Web videos , 2011, 2011 IEEE International Conference on Multimedia and Expo.
[17] M. Kloft,et al. lp-Norm Multiple Kernel Learning , 2011 .
[18] Nicolas Le Roux,et al. Ask the locals: Multi-way local pooling for image recognition , 2011, 2011 International Conference on Computer Vision.
[19] Chong-Wah Ngo,et al. Fast Semantic Diffusion for Large-Scale Context-Based Image and Video Annotation , 2012, IEEE Transactions on Image Processing.
[20] Nitish Srivastava,et al. Multimodal learning with deep Boltzmann machines , 2012, J. Mach. Learn. Res..
[21] Trevor Darrell,et al. Beyond spatial pyramids: Receptive field learning for pooled image features , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[22] Meng Wang,et al. Event Driven Web Video Summarization by Tag Localization and Key-Shot Identification , 2012, IEEE Transactions on Multimedia.
[23] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[24] Mubarak Shah,et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.
[25] Rob Fergus,et al. Stochastic Pooling for Regularization of Deep Convolutional Neural Networks , 2013, ICLR.
[26] Mario Fritz,et al. Learning Smooth Pooling Regions for Visual Recognition , 2013, BMVC.
[27] Ming Yang,et al. 3D Convolutional Neural Networks for Human Action Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[28] Chong-Wah Ngo,et al. Annotation for free: video tagging by mining user search behavior , 2013, ACM Multimedia.
[29] Yoshua Bengio,et al. Maxout Networks , 2013, ICML.
[30] Cordelia Schmid,et al. Action Recognition with Improved Trajectories , 2013, 2013 IEEE International Conference on Computer Vision.
[31] Yi Yang,et al. How Related Exemplars Help Complex Event Detection in Web Videos? , 2013, 2013 IEEE International Conference on Computer Vision.
[32] Yong Jae Lee,et al. Weakly-supervised Discovery of Visual Pattern Configurations , 2014, NIPS.
[33] Razvan Pascanu,et al. Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks , 2013, ECML/PKDD.
[34] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[35] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[36] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[37] Iasonas Kokkinos,et al. Untangling Local and Global Deformations in Deep Convolutional Networks for Image Classification and Sliding Window Detection , 2014, ArXiv.
[38] Yue Gao,et al. Exploiting Web Images for Semantic Video Indexing Via Robust Sample-Specific Loss , 2014, IEEE Transactions on Multimedia.
[39] Meng Wang,et al. Image Annotation By Multiple-Instance Learning With Discriminative Feature Mapping and Selection , 2014 .
[40] Geoffrey Zweig,et al. From captions to visual concepts and back , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[41] Xi Wang,et al. Modeling Spatial-Temporal Clues in a Hybrid Deep Learning Framework for Video Classification , 2015, ACM Multimedia.
[42] Ruslan Salakhutdinov,et al. Action Recognition using Visual Attention , 2015, NIPS 2015.
[43] G. K. Tam,et al. Event Fisher Vectors: Robust Encoding Visual Diversity of Visual Streams , 2015 .
[44] Jian Sun,et al. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[45] Ronan Collobert,et al. From image-level to pixel-level labeling with Convolutional Networks , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[47] Ivan Laptev,et al. Is object localization for free? - Weakly-supervised learning with convolutional neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[49] Trevor Darrell,et al. Fully Convolutional Multi-Class Multiple Instance Learning , 2014, ICLR.
[50] Zhuowen Tu,et al. Generalizing Pooling Functions in Convolutional Neural Networks: Mixed, Gated, and Tree , 2015, AISTATS.
[51] Ali Farhadi,et al. Actions ~ Transformations , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[52] Brendan J. Frey,et al. Classifying and segmenting microscopy images with deep multiple instance learning , 2015, Bioinform..
[53] Yu-Gang Jiang,et al. Harnessing Object and Scene Semantics for Large-Scale Video Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[54] Luc Van Gool,et al. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition , 2016, ECCV.
[55] Yi Yang,et al. Semantic Pooling for Complex Event Analysis in Untrimmed Videos , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[56] Shih-Fu Chang,et al. Exploiting Feature and Class Relationships in Video Categorization with Regularized Deep Neural Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[57] Shih-Fu Chang,et al. Modeling Multimodal Clues in a Hybrid Deep Learning Framework for Video Classification , 2017, IEEE Transactions on Multimedia.
[58] Cees Snoek,et al. VideoLSTM convolves, attends and flows for action recognition , 2016, Comput. Vis. Image Underst..