Something-Else: Compositional Action Recognition With Spatial-Temporal Interaction Networks
暂无分享,去创建一个
[1] Bolei Zhou,et al. Temporal Relational Reasoning in Videos , 2017, ECCV.
[2] Xin Pan,et al. YouTube-BoundingBoxes: A Large High-Precision Human-Annotated Data Set for Object Detection in Video , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Yann LeCun,et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[4] Li Fei-Fei,et al. Neural Graph Matching Networks for Fewshot 3D Action Recognition , 2018, ECCV.
[5] Razvan Pascanu,et al. Visual Interaction Networks: Learning a Physics Simulator from Video , 2017, NIPS.
[6] Silvio Savarese,et al. Structural-RNN: Deep Learning on Spatio-Temporal Graphs , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Chuang Gan,et al. TSM: Temporal Shift Module for Efficient Video Understanding , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[8] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Trevor Darrell,et al. Spatio-Temporal Action Graph Networks , 2018, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).
[10] Richard P. Wildes,et al. Spatiotemporal Residual Networks for Video Action Recognition , 2016, NIPS.
[11] Martial Hebert,et al. From Red Wine to Red Tomato: Composition with Context , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Kaiming He,et al. Detecting and Recognizing Human-Object Interactions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[13] Juan Carlos Niebles,et al. Few-Shot Video Classification via Temporal Alignment , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Susanne Westphal,et al. The “Something Something” Video Database for Learning and Evaluating Visual Common Sense , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[15] Joan Bruna,et al. Few-Shot Learning with Graph Neural Networks , 2017, ICLR.
[16] Jitendra Malik,et al. SlowFast Networks for Video Recognition , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[17] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Harold W. Kuhn,et al. The Hungarian method for the assignment problem , 1955, 50 Years of Integer Programming.
[19] Kaiming He,et al. Long-Term Feature Banks for Detailed Video Understanding , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Yu-Chiang Frank Wang,et al. A Closer Look at Few-shot Classification , 2019, ICLR.
[21] Alexei A. Efros,et al. Using Multiple Segmentations to Discover Objects and their Extent in Image Collections , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[22] Bolei Zhou,et al. Reasoning About Human-Object Interactions Through Dual Attention Networks , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[23] Ross B. Girshick,et al. Fast R-CNN , 2015, 1504.08083.
[24] Donald D. Hoffman,et al. Parts of recognition , 1984, Cognition.
[25] Fei-Fei Li,et al. Modeling mutual context of object and human pose in human-object interaction activities , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[26] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[27] Ming Yang,et al. 3D Convolutional Neural Networks for Human Action Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[28] Oriol Vinyals,et al. Matching Networks for One Shot Learning , 2016, NIPS.
[29] Matthew J. Hausknecht,et al. Beyond short snippets: Deep networks for video classification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] David A. McAllester,et al. Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[31] Yin Li,et al. Compositional Learning for Human Object Interaction , 2018, ECCV.
[32] Zhuowen Tu,et al. Aggregated Residual Transformations for Deep Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Joanna Materzynska,et al. The Jester Dataset: A Large-Scale Video Dataset of Human Gestures , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).
[34] Abhinav Gupta,et al. Videos as Space-Time Region Graphs , 2018, ECCV.
[35] Chen Sun,et al. Rethinking Spatiotemporal Feature Learning For Video Understanding , 2017, ArXiv.
[36] Ali Farhadi,et al. Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding , 2016, ECCV.
[37] David A. Forsyth,et al. Searching for Complex Human Activities with No Visual Examples , 2008, International Journal of Computer Vision.
[38] Michael S. Bernstein,et al. Image retrieval using scene graphs , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[39] Song Han,et al. Temporal Shift Module for Efficient Video Understanding , 2018, ArXiv.
[40] Fabio Tozeto Ramos,et al. Simple online and realtime tracking , 2016, 2016 IEEE International Conference on Image Processing (ICIP).
[41] Richard S. Zemel,et al. Prototypical Networks for Few-shot Learning , 2017, NIPS.
[42] Yichen Wei,et al. Relation Networks for Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[43] Sudhakar Putheti,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2020, Information and Communication Technology for Intelligent Systems.
[44] Ross B. Girshick,et al. Mask R-CNN , 2017, 1703.06870.
[45] Yi Yang,et al. Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Xiao Liu,et al. Revisiting the Effectiveness of Off-the-shelf Temporal Modeling Approaches for Large-scale Video Classification , 2017, ArXiv.
[47] Leonidas J. Guibas,et al. Learning Shape Abstractions by Assembling Volumetric Primitives , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[49] Kaiming He,et al. Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[50] Dan Klein,et al. Neural Module Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[51] Larry S. Davis,et al. Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[52] Daan Wierstra,et al. Meta-Learning with Memory-Augmented Neural Networks , 2016, ICML.
[53] Li Fei-Fei,et al. Inferring and Executing Programs for Visual Reasoning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[54] Cordelia Schmid,et al. AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[55] Razvan Pascanu,et al. Relational inductive biases, deep learning, and graph networks , 2018, ArXiv.
[56] Kate Saenko,et al. A combined pose, object, and feature model for action understanding , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[57] Max Welling,et al. Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.
[58] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[59] Yong Jae Lee,et al. Mid-level Features Improve Recognition of Interactive Activities , 2012 .
[60] Razvan Pascanu,et al. A simple neural network module for relational reasoning , 2017, NIPS.
[61] Cordelia Schmid,et al. Actor-Centric Relation Network , 2018, ECCV.
[62] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[63] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[64] Mubarak Shah,et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.
[65] Subhransu Maji,et al. Meta-Learning With Differentiable Convex Optimization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[66] Abhinav Gupta,et al. Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[67] Luc Van Gool,et al. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition , 2016, ECCV.
[68] T. Başar,et al. A New Approach to Linear Filtering and Prediction Problems , 2001 .
[69] Fabio Viola,et al. The Kinetics Human Action Video Dataset , 2017, ArXiv.
[70] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[71] Andrew Zisserman,et al. Video Action Transformer Network , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[72] Dima Damen,et al. Scaling Egocentric Vision: The EPIC-KITCHENS Dataset , 2018, ArXiv.
[73] Trevor Darrell,et al. Learning to Reason: End-to-End Module Networks for Visual Question Answering , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[74] Shuicheng Yan,et al. Graph-Based Global Reasoning Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[75] Hugo Larochelle,et al. Optimization as a Model for Few-Shot Learning , 2016, ICLR.
[76] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[77] Razvan Pascanu,et al. Interaction Networks for Learning about Objects, Relations and Physics , 2016, NIPS.