暂无分享,去创建一个
Mubarak Shah | Rohit Gupta | Ishan R. Dave | Ishan Dave | Mamshad Nayeem Rizve | M. Shah | Rohit Gupta | I. Dave
[1] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[2] Martial Hebert,et al. Shuffle and Learn: Unsupervised Learning Using Temporal Order Verification , 2016, ECCV.
[3] Daniel Yamins,et al. Unsupervised Learning From Video With Deep Neural Embeddings , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Irfan A. Essa,et al. Video Jigsaw: Unsupervised Learning of Spatiotemporal Context for Video Action Recognition , 2018, 2019 IEEE Winter Conference on Applications of Computer Vision (WACV).
[5] Wei Liu,et al. Self-Supervised Video Representation Learning by Uncovering Spatio-Temporal Statistics , 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[6] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[7] Andrew Zisserman,et al. Memory-augmented Dense Predictive Coding for Video Representation Learning , 2020, ECCV.
[8] Hirokatsu Kataoka,et al. Learning Spatiotemporal 3D Convolution with Video Order Self-supervision , 2018, ECCV Workshops.
[9] Andrew Zisserman,et al. Self-supervised Co-training for Video Representation Learning , 2020, NeurIPS.
[10] Tao Mei,et al. SeCo: Exploring Sequence Supervision for Unsupervised Representation Learning , 2020, ArXiv.
[11] Peyman Moghadam,et al. Temporally Coherent Embeddings for Self-Supervised Video Representation Learning , 2020, 2020 25th International Conference on Pattern Recognition (ICPR).
[12] Tao Xiang,et al. Self-Supervised Video Representation Learning with Constrained Spatiotemporal Jigsaw , 2021, IJCAI.
[13] Yi Cao,et al. Self-supervised video representation learning by maximizing mutual information , 2020, Signal Process. Image Commun..
[14] Chen Sun,et al. Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification , 2017, ECCV.
[15] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Wonjun Hwang,et al. Self-Supervised Spatio-Temporal Representation Learning Using Variable Playback Speed Prediction , 2020, ArXiv.
[17] Wei Liu,et al. Self-Supervised Spatio-Temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[18] R Devon Hjelm,et al. Learning Representations by Maximizing Mutual Information Across Views , 2019, NeurIPS.
[19] R. Devon Hjelm,et al. Representation Learning with Video Deep InfoMax , 2020, ArXiv.
[20] Hang Zhao,et al. HACS: Human Action Clips and Segments Dataset for Recognition and Temporal Localization , 2017, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[21] William T. Freeman,et al. SpeedNet: Learning the Speediness in Videos , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Toshihiko Yamasaki,et al. Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework , 2020, ACM Multimedia.
[23] Serge J. Belongie,et al. Spatiotemporal Contrastive Video Representation Learning , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Yueting Zhuang,et al. Self-Supervised Spatiotemporal Learning via Video Clip Order Prediction , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Kaiming He,et al. Momentum Contrast for Unsupervised Visual Representation Learning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Jianbo Jiao,et al. Self-supervised Video Representation Learning by Pace Prediction , 2020, ECCV.
[27] Paolo Favaro,et al. Video Representation Learning by Recognizing Temporal Transformations , 2020, ECCV.
[28] In-So Kweon,et al. Self-Supervised Video Representation Learning with Space-Time Cubic Puzzles , 2018, AAAI.
[29] Yannis Kalantidis,et al. Hard Negative Mixing for Contrastive Learning , 2020, NeurIPS.
[30] Weiping Wang,et al. Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning , 2020, AAAI.
[31] Andrew Zisserman,et al. Learning and Using the Arrow of Time , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[32] Thomas Serre,et al. HMDB: A large video database for human motion recognition , 2011, 2011 International Conference on Computer Vision.
[33] Andrew Zisserman,et al. Video Representation Learning by Dense Predictive Coding , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).
[34] Andrew Zisserman,et al. End-to-End Learning of Visual Representations From Uncurated Instructional Videos , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Jitendra Malik,et al. SlowFast Networks for Video Recognition , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[36] Martial Hebert,et al. Unsupervised Learning of Video Representations via Dense Trajectory Clustering , 2020, ECCV Workshops.
[37] Efstratios Gavves,et al. Self-Supervised Video Representation Learning with Odd-One-Out Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Yi Li,et al. RESOUND: Towards Action Recognition Without Representation Bias , 2018, ECCV.
[39] Mubarak Shah,et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.
[40] Ming-Hsuan Yang,et al. Unsupervised Representation Learning by Sorting Sequences , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[41] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[42] Laurens van der Maaten,et al. Self-Supervised Learning of Pretext-Invariant Representations , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[43] Geoffrey E. Hinton,et al. A Simple Framework for Contrastive Learning of Visual Representations , 2020, ICML.
[44] Yutaka Satoh,et al. Would Mega-scale Datasets Further Enhance Spatiotemporal 3D CNNs? , 2020, ArXiv.
[45] Andrew Owens,et al. Self-Supervised Learning of Audio-Visual Objects from Video , 2020, ECCV.
[46] Yutaka Satoh,et al. Ground Truth : Presenting weather forecast Result : Presenting weather forecast Ground Truth : Bench Pressing Result : Bench Pressing Ground Truth : Salsa Dancing Result : Salsa Dancing Ground Truth : Slapping Result : , 2018 .
[47] Bolei Zhou,et al. Video Representation Learning with Visual Tempo Consistency , 2020, ArXiv.
[48] Chen Gao,et al. Why Can't I Dance in the Mall? Learning to Mitigate Scene Bias in Action Recognition , 2019, NeurIPS.
[49] Cordelia Schmid,et al. Learning Video Representations using Contrastive Bidirectional Transformer , 2019 .
[50] Samia Ainouz,et al. Temporal Contrastive Pretraining for Video Action Recognition , 2020, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).
[51] Apostol Natsev,et al. YouTube-8M: A Large-Scale Video Classification Benchmark , 2016, ArXiv.
[52] Aapo Hyvärinen,et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models , 2010, AISTATS.
[53] Yu Zhou,et al. Video Playback Rate Perception for Self-Supervised Spatio-Temporal Representation Learning , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[54] Yann LeCun,et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[55] Longlong Jing,et al. Self-Supervised Spatiotemporal Feature Learning via Video Rotation Prediction. , 2018, 1811.11387.