Universal Prototype Transport for Zero-Shot Action Recognition and Localization

[1] Rayson Laroca, et al. Tell me what you see: A zero-shot action recognition method based on natural language descriptions, 2021, ArXiv.

[2] Pascal Mettes, et al. Zero-Shot Action Recognition from Diverse Object-Scene Compositions, 2021, BMVC.

[3] Cees G. M. Snoek, et al. Object Priors for Classifying and Localizing Unseen Actions, 2021, International Journal of Computer Vision.

[4] Cordelia Schmid, et al. ViViT: A Video Vision Transformer, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

[5] Francis E. H. Tay, et al. Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

[6] Yangyang Xu, et al. Transductive Zero-Shot Action Recognition via Visually Connected Graph Convolutional Networks, 2020, IEEE Transactions on Neural Networks and Learning Systems.

[7] Christopher De Sa, et al. Differentiating through the Fréchet Mean, 2020, ICML.

[8] Heng Tao Shen, et al. Searching for Actions on the Hyperbole, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9] Hema A. Murthy, et al. Zero-shot learning for action recognition using synthesized features, 2020, Neurocomputing.

[10] Cees G. M. Snoek, et al. Shuffled ImageNet Banks for Video Event Detection and Search, 2020, ACM Trans. Multim. Comput. Commun. Appl.

[11] Dima Damen, et al. The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines, 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12] Pietro Perona, et al. Rethinking Zero-Shot Video Classification: End-to-End Training for Realistic Applications, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[13] Gregory D. Hager, et al. DASZL: Dynamic Action Signatures for Zero-shot Learning, 2019, AAAI.

[14] David Menotti, et al. Zero-Shot Action Recognition in Videos: A Survey, 2019, Neurocomputing.

[15] Ioannis Patras, et al. TARN: Temporal Attentive Relation Network for Few-Shot and Zero-Shot Action Recognition, 2019, BMVC.

[16] Changsheng Xu, et al. I Know the Relationships: Zero-Shot Action Recognition via Two-Stream Graph Convolutional Networks and Knowledge Graphs, 2019, AAAI.

[17] Marco Cuturi, et al. Computational Optimal Transport: With Applications to Data Science, 2019.

[18] Ling Shao, et al. Out-Of-Distribution Detection for Generalized Zero-Shot Action Recognition, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[19] James M. Rehg, et al. Action2Vec: A Crossmodal Embedding Approach to Action Learning, 2019, ArXiv.

[20] Jitendra Malik, et al. SlowFast Networks for Video Recognition, 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[21] Rainer Stiefelhagen, et al. Towards a Fair Evaluation of Zero-Shot Action Recognition Using External Data, 2018, ECCV Workshops.

[22] Xavier Pennec, et al. geomstats: a Python Package for Riemannian Geometry in Machine Learning, 2018, ArXiv.

[23] Ling Shao, et al. Towards Universal Representation for Unseen Action Recognition, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[24] Bolei Zhou, et al. Moments in Time Dataset: One Million Videos for Event Understanding, 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25] Yann LeCun, et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[26] Huadong Ma, et al. Generalized zero-shot learning for action recognition with web-scale video data, 2017, World Wide Web.

[27] Cees Snoek, et al. Spatial-Aware Object Embeddings for Zero-Shot Localization and Classification of Actions, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[28] Andrew Zisserman, et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29] Kilian Q. Weinberger, et al. Densely Connected Convolutional Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30] Shaogang Gong, et al. Exploring synonyms as context in zero-shot action recognition, 2016, 2016 IEEE International Conference on Image Processing (ICIP).

[31] Baoxin Li, et al. Recognizing unseen actions in a domain-adapted embedding space, 2016, 2016 IEEE International Conference on Image Processing (ICIP).

[32] Yu-Gang Jiang, et al. Harnessing Object and Scene Semantics for Large-Scale Video Understanding, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33] Tianbao Yang, et al. Learning Attributes Equals Multi-Source Domain Generalization, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34] Deli Zhao, et al. Recognizing an Action Using Its Name: A Knowledge-Based Approach, 2016, International Journal of Computer Vision.

[35] Yi Yang, et al. Concepts Not Alone: Exploring Pairwise Relationships for Zero-Shot Video Activity Recognition, 2016, AAAI.

[36] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37] Shaogang Gong, et al. Unsupervised Domain Adaptation for Zero-Shot Learning, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[38] Xun Xu, et al. Transductive Zero-Shot Action Recognition by Word-Vector Embedding, 2015, International Journal of Computer Vision.

[39] Cees G. M. Snoek, et al. Objects2action: Classifying and Localizing Actions without Any Video Example, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[40] Bernard Ghanem, et al. ActivityNet: A large-scale video benchmark for human activity understanding, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41] Chunheng Wang, et al. Robust relative attributes for human action recognition, 2015, Pattern Analysis and Applications.

[42] Shaogang Gong, et al. Transductive Multi-view Embedding for Zero-Shot Recognition and Annotation, 2014, ECCV.

[43] Bernt Schiele, et al. Transfer Learning in a Transductive Setting, 2013, NIPS.

[44] Jeffrey Dean, et al. Distributed Representations of Words and Phrases and their Compositionality, 2013, NIPS.

[45] Jeffrey Dean, et al. Efficient Estimation of Word Representations in Vector Space, 2013, ICLR.

[46] Wolfgang Heidrich, et al. Displacement interpolation using Lagrangian mass transport, 2011, ACM Trans. Graph.

[47] Silvio Savarese, et al. Recognizing human actions by attributes, 2011, CVPR 2011.

[48] Inderjit S. Dhillon, et al. Clustering on the Unit Hypersphere using von Mises-Fisher Distributions, 2005, J. Mach. Learn. Res.

[49] M. Shah, et al. Reformulating Zero-shot Action Recognition for Multi-label Actions, 2021, NeurIPS.

[50] Nicolas Courty, et al. POT: Python Optimal Transport, 2021, J. Mach. Learn. Res.