Yuting Gao | Ke Li | Jinpeng Wang | Yiqi Lin | Andy J. Ma | Xing Sun
[1] Fabio Viola,et al. The Kinetics Human Action Video Dataset , 2017, ArXiv.
[2] William T. Freeman,et al. SpeedNet: Learning the Speediness in Videos , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[3] Heng Wang,et al. Video Modeling With Correlation Networks , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Yang Wang,et al. Pulling Actions out of Context: Explicit Separation for Effective Combination , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[5] Sergio Guadarrama,et al. Tracking Emerges by Colorizing Videos , 2018, ECCV.
[6] Yi Li,et al. RESOUND: Towards Action Recognition Without Representation Bias , 2018, ECCV.
[7] Wei Liu,et al. Self-Supervised Spatio-Temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Alexei A. Efros,et al. Colorful Image Colorization , 2016, ECCV.
[9] Limin Wang,et al. Temporal Action Detection with Structured Segment Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[10] Yutaka Satoh,et al. Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[11] Andrew Zisserman,et al. Learning and Using the Arrow of Time , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[12] Thomas Serre,et al. HMDB: A large video database for human motion recognition , 2011, 2011 International Conference on Computer Vision.
[13] Andrew Zisserman,et al. Video Representation Learning by Dense Predictive Coding , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).
[14] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008.
[15] Toshihiko Yamasaki,et al. Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework , 2020, ACM Multimedia.
[16] Hongyi Zhang,et al. mixup: Beyond Empirical Risk Minimization , 2017, ICLR.
[17] Yann LeCun,et al. Dimensionality Reduction by Learning an Invariant Mapping , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[18] Yann LeCun,et al. A Closer Look at Spatiotemporal Convolutions for Action Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[19] Mubarak Shah,et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.
[20] Ming-Hsuan Yang,et al. Unsupervised Representation Learning by Sorting Sequences , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[21] Weiping Wang,et al. Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning , 2020, AAAI.
[22] Seong Joon Oh,et al. CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[23] Martial Hebert,et al. Shuffle and Learn: Unsupervised Learning Using Temporal Order Verification , 2016, ECCV.
[24] Longlong Jing,et al. Self-Supervised Spatiotemporal Feature Learning via Video Rotation Prediction , 2018, arXiv:1811.11387.
[25] Stella X. Yu,et al. Unsupervised Feature Learning via Non-parametric Instance Discrimination , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[26] Jiebo Luo,et al. AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transformations Rather Than Data , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Paolo Favaro,et al. Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles , 2016, ECCV.
[28] James W. Davis,et al. The Recognition of Human Movement Using Temporal Templates , 2001, IEEE Trans. Pattern Anal. Mach. Intell..
[29] Bolei Zhou,et al. Learning Deep Features for Discriminative Localization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Luc Van Gool,et al. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition , 2016, ECCV.
[31] Allan Jabri,et al. Learning Correspondence From the Cycle-Consistency of Time , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Yueting Zhuang,et al. Self-Supervised Spatiotemporal Learning via Video Clip Order Prediction , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Kaiming He,et al. Momentum Contrast for Unsupervised Visual Representation Learning , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Jianbo Jiao,et al. Self-supervised Video Representation Learning by Pace Prediction , 2020, ECCV.
[35] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017.
[36] In-So Kweon,et al. Self-Supervised Video Representation Learning with Space-Time Cubic Puzzles , 2018, AAAI.
[37] Deva Ramanan,et al. CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning , 2020, ICLR.
[38] Cordelia Schmid,et al. Learning to Track for Spatio-Temporal Action Localization , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[39] Michael S. Ryoo,et al. Evolving Losses for Unsupervised Video Representation Learning , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Björn Ommer,et al. Improving Spatiotemporal Self-Supervision by Deep Reinforcement Learning , 2018, ECCV.
[41] Thomas Brox,et al. Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[42] Bolei Zhou,et al. Temporal Relational Reasoning in Videos , 2017, ECCV.
[43] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Yiannis Kompatsiaris,et al. Robust Temporal Activity Templates Using Higher Order Statistics , 2009, IEEE Transactions on Image Processing.
[45] Chen Gao,et al. Why Can't I Dance in the Mall? Learning to Mitigate Scene Bias in Action Recognition , 2019, NeurIPS.
[46] Sergey Levine,et al. Time-Contrastive Networks: Self-Supervised Learning from Video , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[47] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[48] Nikos Komodakis,et al. Unsupervised Representation Learning by Predicting Image Rotations , 2018, ICLR.
[49] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[50] Geoffrey E. Hinton,et al. A Simple Framework for Contrastive Learning of Visual Representations , 2020, ICML.
[51] Shih-Fu Chang,et al. Unsupervised Embedding Learning via Invariant and Spreading Instance Feature , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[52] Limin Wang,et al. Temporal Segment Networks for Action Recognition in Videos , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[53] Sergio Escalera,et al. Gate-Shift Networks for Video Action Recognition , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[54] Andrew Zisserman,et al. Self-supervised Co-training for Video Representation Learning , 2020, NeurIPS.
[55] Leonidas J. Guibas,et al. Geometry Guided Convolutional Neural Networks for Self-Supervised Video Representation Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[56] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[57] Ronen Basri,et al. Actions as space-time shapes , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.
[58] Andrew Zisserman,et al. Memory-augmented Dense Predictive Coding for Video Representation Learning , 2020, ECCV.
[59] Phillip Isola,et al. Contrastive Multiview Coding , 2019, ECCV.
[60] Andrew Zisserman,et al. End-to-End Learning of Visual Representations From Uncurated Instructional Videos , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[61] Shanmuganathan Raman,et al. Attentive Spatio-Temporal Representation Learning for Diving Classification , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[62] Antonio Torralba,et al. Generating Videos with Scene Dynamics , 2016, NIPS.
[63] Jianbo Shi,et al. Learning Discriminative Motion Features Through Detection , 2018, ArXiv.
[64] Aapo Hyvärinen,et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models , 2010, AISTATS.
[65] Alan Yuille,et al. Grouped Spatial-Temporal Aggregation for Efficient Action Recognition , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[66] Luc Van Gool,et al. DynamoNet: Dynamic Action and Motion Network , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[67] Yu Zhou,et al. Video Playback Rate Perception for Self-Supervised Spatio-Temporal Representation Learning , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).