Seon Joo Kim | Jinwoo Kim | Taehyun Kim | Hyolim Kang | Seonhoon Kim
[1] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[2] D. Reisberg. The Oxford Handbook of Cognitive Psychology, 2013.
[3] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Larry S. Davis, et al. Gait Recognition Using Image Self-Similarity, 2004, EURASIP J. Adv. Signal Process.
[5] Ce Liu, et al. Supervised Contrastive Learning, 2020, NeurIPS.
[6] J. MacQueen. Some methods for classification and analysis of multivariate observations, 1967.
[7] Larry S. Davis, et al. EigenGait: Motion-Based Recognition of People Using Image Self-Similarity, 2001, AVBPA.
[8] Supplementary Material for: Time-Equivariant Contrastive Video Representation Learning, 2021.
[9] Shilei Wen, et al. BMN: Boundary-Matching Network for Temporal Action Proposal Generation, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[10] R. Nevatia, et al. TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[11] Ming Yang, et al. BSN: Boundary Sensitive Network for Temporal Action Proposal Generation, 2018, ECCV.
[12] Xinlei Chen, et al. Exploring Simple Siamese Representation Learning, 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Jiaya Jia, et al. Parametric Contrastive Learning, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[14] Fabio Viola, et al. The Kinetics Human Action Video Dataset, 2017, ArXiv.
[15] Antonis A. Argyros, et al. Unsupervised Detection of Periodic Segments in Videos, 2018, 2018 25th IEEE International Conference on Image Processing (ICIP).
[16] Jeffrey M. Zacks, et al. A Computational Model of Event Segmentation From Perceptual Prediction, 2007, Cogn. Sci.
[17] Michal Valko, et al. Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning, 2020, NeurIPS.
[18] Seong Jong Ha, et al. Zero-shot Natural Language Video Localization, 2021, ArXiv.
[19] Kaiming He, et al. Improved Baselines with Momentum Contrastive Learning, 2020, ArXiv.
[20] Nicolas Usunier, et al. End-to-End Object Detection with Transformers, 2020, ECCV.
[21] Luc Van Gool, et al. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition, 2016, ECCV.
[22] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[23] Jonathan Tompson, et al. Counting Out Time: Class Agnostic Video Repetition Counting in the Wild, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Hyolim Kang, et al. CAG-QIL: Context-Aware Actionness Grouping via Q Imitation Learning for Online Temporal Action Localization, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[25] Gregory D. Hager, et al. Segmental Spatiotemporal CNNs for Fine-Grained Action Segmentation, 2016, ECCV.
[26] Yongzhao Zhan, et al. A Survey on Temporal Action Localization, 2020, IEEE Access.
[27] Wei Liu, et al. VideoMoCo: Contrastive Video Representation Learning with Temporally Adversarial Examples, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Yiannis Kompatsiaris, et al. ViSiL: Fine-Grained Spatio-Temporal Video Similarity Learning, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[29] Weiyao Wang, et al. Generic Event Boundary Detection: A Benchmark for Event Segmentation, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[30] Raffay Hamid, et al. Shot Contrastive Self-Supervised Learning for Scene Boundary Detection, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Alexander Kolesnikov, et al. MLP-Mixer: An all-MLP Architecture for Vision, 2021, NeurIPS.
[32] Geoffrey E. Hinton, et al. Big Self-Supervised Models are Strong Semi-Supervised Learners, 2020, NeurIPS.
[33] Bernard Ghanem, et al. DAPs: Deep Action Proposals for Action Understanding, 2016, ECCV.
[34] Sébastien Marcel, et al. Torchvision the machine-vision package of torch, 2010, ACM Multimedia.
[35] Jianfeng Dong, et al. Hierarchical Sequence Representation with Graph Network, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[36] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[37] Kaiming He, et al. Momentum Contrast for Unsupervised Visual Representation Learning, 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Bernard Ghanem, et al. SST: Single-Stream Temporal Action Proposals, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[39] Jeffrey M. Zacks, et al. Segmentation in the perception and memory of events, 2008, Trends in Cognitive Sciences.
[40] Georg Heigold, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, 2021, ICLR.
[41] Serge J. Belongie, et al. Spatiotemporal Contrastive Video Representation Learning, 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).