Benchmarking Data Efficiency and Computational Efficiency of Temporal Action Localization Models
J. V. Gemert | Yunhan Wang | Ombretta Strafforello | Robert-Jan Bruintjes | A. Lengyel | Jan Warchocki | Teodor Oprescu | Alexandru Damacus | Paul Misterka