PyTorchVideo: A Deep Learning Library for Video Understanding
暂无分享,去创建一个
Aaron B. Adcock | Ross B. Girshick | Wan-Yen Lo | Matt Feiszli | Christoph Feichtenhofer | Ross Girshick | Yanghao Li | Jitendra Malik | Haoqi Fan | Kalyan Vasudev Alwala | Tullie Murrell | Heng Wang | Meng Li | Haichuan Yang | Nikhila Ravi | Yilei Li | Bo Xiong | Aaron Adcock | J. Malik | Haoqi Fan | Heng Wang | Wan-Yen Lo | Christoph Feichtenhofer | Yanghao Li | Matt Feiszli | Haichuan Yang | Tullie Murrell | Nikhila Ravi | Bo Xiong | Yilei Li | Meng Li
[1] Dima Damen,et al. Scaling Egocentric Vision: The EPIC-KITCHENS Dataset , 2018, ArXiv.
[2] Christoph Feichtenhofer,et al. Multiscale Vision Transformers , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[3] Mubarak Shah,et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.
[4] Kaiming He,et al. A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Jitendra Malik,et al. SlowFast Networks for Video Recognition , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[7] Yann LeCun,et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[8] Michal Valko,et al. Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning , 2020, NeurIPS.
[9] Thomas Serre,et al. HMDB: A large video database for human motion recognition , 2011, 2011 International Conference on Computer Vision.
[10] Ali Farhadi,et al. Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding , 2016, ECCV.
[11] Heng Wang,et al. Video Classification With Channel-Separated Convolutional Networks , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[12] Yong Jae Lee,et al. Audiovisual SlowFast Networks for Video Recognition , 2020, ArXiv.
[13] Mario F. M. Campos,et al. A Weighted Sparse Sampling and Smoothing Frame Transition Approach for Semantic Fast-Forward First-Person Videos , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[14] Cordelia Schmid,et al. AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[15] Quoc V. Le,et al. Randaugment: Practical automated data augmentation with a reduced search space , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[16] Kaiming He,et al. Momentum Contrast for Unsupervised Visual Representation Learning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[17] He He,et al. GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing , 2020, J. Mach. Learn. Res..
[18] Christoph Feichtenhofer,et al. X3D: Expanding Architectures for Efficient Video Recognition , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Balaji Lakshminarayanan,et al. AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty , 2020, ICLR.
[20] Susanne Westphal,et al. The “Something Something” Video Database for Learning and Evaluating Visual Common Sense , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[21] Geoffrey E. Hinton,et al. A Simple Framework for Contrastive Learning of Visual Representations , 2020, ICML.