Scaling Human-Object Interaction Recognition in the Video through Zero-Shot Learning
[1] Ali Farhadi,et al. Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding , 2016, ECCV.
[2] Yin Li,et al. Compositional Learning for Human Object Interaction , 2018, ECCV.
[3] Venkatesh Saligrama,et al. Zero-Shot Learning via Joint Latent Similarity Embedding , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Karim Faez,et al. Zero-Shot Learning on Human-Object Interaction Recognition in Video , 2019, 2019 5th Iranian Conference on Signal Processing and Intelligent Systems (ICSPIS).
[5] Dan Klein,et al. Neural Module Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Ali Farhadi,et al. You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Samy Bengio,et al. Zero-Shot Learning by Convex Combination of Semantic Embeddings , 2013, ICLR.
[8] Chenliang Xu,et al. Can humans fly? Action understanding with multiple classes of actors , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Ali Farhadi,et al. Asynchronous Temporal Fields for Action Recognition , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] J. Gibson. The Ecological Approach to Visual Perception , 1979.
[11] E. Reed. The Ecological Approach to Visual Perception , 1989.
[12] Basura Fernando,et al. Learning End-to-end Video Classification with Rank-Pooling , 2016, ICML.
[13] Vittorio Ferrari,et al. Fast Object Segmentation in Unconstrained Video , 2013, 2013 IEEE International Conference on Computer Vision.
[14] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[15] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[16] Ivan Laptev,et al. Learning person-object interactions for action recognition in still images , 2011, NIPS.
[17] Svetlana Lazebnik,et al. Learning Models for Actions and Person-Object Interactions with Transfer to Question Answering , 2016, ECCV.
[18] Ross B. Girshick,et al. Fast R-CNN , 2015, arXiv:1504.08083.
[19] Fei-Fei Li,et al. Modeling mutual context of object and human pose in human-object interaction activities , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[20] Steven Bird,et al. NLTK: The Natural Language Toolkit , 2002, ACL.
[21] Xinlei Chen,et al. NEIL: Extracting Visual Knowledge from Web Data , 2013, 2013 IEEE International Conference on Computer Vision.
[22] Deva Ramanan,et al. Detecting Actions, Poses, and Objects with Relational Phraselets , 2012, ECCV.
[23] Koen E. A. van de Sande,et al. Selective Search for Object Recognition , 2013, International Journal of Computer Vision.
[24] Nitish Srivastava,et al. Unsupervised Learning of Video Representations using LSTMs , 2015, ICML.
[25] Cees Snoek,et al. Objects2action: Classifying and Localizing Actions without Any Video Example , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[26] Wei Liu,et al. SSD: Single Shot MultiBox Detector , 2015, ECCV.
[27] Qiang Li,et al. Action detection based on tracklets with the two-stream CNN , 2018, Multimedia Tools and Applications.
[28] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[29] Shaogang Gong,et al. Zero-shot object recognition by semantic manifold distance , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Cewu Lu,et al. Transferable Interactiveness Knowledge for Human-Object Interaction Detection , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Robert Bergevin,et al. Semantic human activity recognition: A literature review , 2015, Pattern Recognit.
[32] Ahad Harati,et al. A component-based video content representation for action recognition , 2019, Image Vis. Comput.
[33] Cordelia Schmid,et al. P-CNN: Pose-Based CNN Features for Action Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[34] Alexei A. Efros,et al. Scene Semantics from Long-Term Observation of People , 2012, ECCV.
[35] Michael S. Bernstein,et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.
[36] Abhinav Gupta,et al. ActionVLAD: Learning Spatio-Temporal Aggregation for Action Classification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Leonidas J. Guibas,et al. Human action recognition by learning bases of action attributes and parts , 2011, 2011 International Conference on Computer Vision.
[38] Chen Gao,et al. iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection , 2018, BMVC.
[39] Kristen Grauman,et al. Zero-shot recognition with unreliable attributes , 2014, NIPS.
[40] Shaogang Gong,et al. Semantic embedding space for zero-shot action recognition , 2015, 2015 IEEE International Conference on Image Processing (ICIP).
[41] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[42] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[43] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[44] Michael S. Bernstein,et al. Visual Relationship Detection with Language Priors , 2016, ECCV.
[45] Rainer Stiefelhagen,et al. How to Transfer? Zero-Shot Object Recognition via Hierarchical Transfer of Semantic Attributes , 2015, 2015 IEEE Winter Conference on Applications of Computer Vision.
[46] Silvio Savarese,et al. Recognizing human actions by attributes , 2011, CVPR 2011.
[47] Yann LeCun,et al. Convolutional Learning of Spatio-temporal Features , 2010, ECCV.
[48] Alexander J. Smola,et al. Compressed Video Action Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[49] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[50] Cordelia Schmid,et al. Multi-region Two-Stream R-CNN for Action Detection , 2016, ECCV.
[51] Xiaodong Yu,et al. Attribute-Based Transfer Learning for Object Categorization with Zero/One Training Example , 2010, ECCV.
[52] Cordelia Schmid,et al. Joint Learning of Object and Action Detectors , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[53] Cordelia Schmid,et al. Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[54] Bernard Ghanem,et al. On the relationship between visual attributes and convolutional networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[55] Li Fei-Fei,et al. Scaling Human-Object Interaction Recognition Through Zero-Shot Learning , 2018, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).
[56] Mubarak Shah,et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.
[57] Andrew Zisserman,et al. What have We Learned from Deep Representations for Action Recognition? , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[58] Cees Snoek,et al. What do 15,000 object categories tell us about classifying and localizing actions? , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[59] Andrew Zisserman,et al. Convolutional Two-Stream Network Fusion for Video Action Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[60] Larry S. Davis,et al. Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[61] Tao Xiang,et al. Learning a Deep Embedding Model for Zero-Shot Learning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[62] Thomas Brox,et al. High Accuracy Optical Flow Estimation Based on a Theory for Warping , 2004, ECCV.
[63] Cordelia Schmid,et al. Learning to Track for Spatio-Temporal Action Localization , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[64] Luc Van Gool,et al. Actionness Estimation Using Hybrid Fully Convolutional Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[65] Frédéric Jurie,et al. Improving Semantic Embedding Consistency by Metric Learning for Zero-Shot Classification , 2016, ECCV.
[66] Subhransu Maji,et al. Action recognition from a distributed representation of pose and appearance , 2011, CVPR 2011.
[67] L. Stark,et al. Dissertation Abstract , 1994, Journal of Cognitive Education and Psychology.
[68] Jiaxuan Wang,et al. HICO: A Benchmark for Recognizing Human-Object Interactions in Images , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[69] Christoph H. Lampert,et al. Learning to detect unseen object classes by between-class attribute transfer , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[70] Martial Hebert,et al. From Red Wine to Red Tomato: Composition with Context , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[71] Jia Deng,et al. Learning to Detect Human-Object Interactions , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).
[72] Thomas Serre,et al. HMDB: A large video database for human motion recognition , 2011, 2011 International Conference on Computer Vision.
[73] Ming Yang,et al. 3D Convolutional Neural Networks for Human Action Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[74] Kaiming He,et al. Detecting and Recognizing Human-Object Interactions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[75] Jitendra Malik,et al. Finding action tubes , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).