Self-supervising Action Recognition by Statistical Moment and Subspace Descriptors
[1] C. Schmid,et al. On the burstiness of visual elements , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[2] Krystian Mikolajczyk,et al. Soft assignment of visual words as Linear Coordinate Coding and optimisation of its reconstruction error , 2011, 2011 18th IEEE International Conference on Image Processing.
[3] Cordelia Schmid,et al. Dense Trajectories and Motion Boundary Descriptors for Action Recognition , 2013, International Journal of Computer Vision.
[4] Anoop Cherian,et al. Higher-Order Pooling of CNN Features via Kernel Linearization for Action Recognition , 2017, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).
[5] Thomas B. Moeslund,et al. Selective spatio-temporal interest points , 2012, Comput. Vis. Image Underst..
[6] Cordelia Schmid,et al. Action Recognition with Improved Trajectories , 2013, 2013 IEEE International Conference on Computer Vision.
[7] Luc Van Gool,et al. An Efficient Dense and Scale-Invariant Spatio-Temporal Interest Point Detector , 2008, ECCV.
[8] Michael S. Ryoo,et al. AssembleNet++: Assembling Modality Representations via Attention Connections , 2020, ECCV.
[9] Ali Farhadi,et al. Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding , 2016, ECCV.
[10] Gabriela Csurka,et al. Visual categorization with bags of keypoints , 2004, ECCV 2004.
[11] Anoop Cherian,et al. Tensor Representations via Kernel Linearization for Action Recognition from 3D Skeletons , 2016, ECCV.
[12] Michael S. Ryoo,et al. AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures , 2019, ICLR.
[13] Ivan Laptev,et al. On Space-Time Interest Points , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[14] Ali Farhadi,et al. You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Thomas Serre,et al. HMDB: A large video database for human motion recognition , 2011, 2011 International Conference on Computer Vision.
[16] Ming Yang,et al. 3D Convolutional Neural Networks for Human Action Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[17] Richard P. Wildes,et al. Temporal Residual Networks for Dynamic Scene Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Leland McInnes,et al. UMAP: Uniform Manifold Approximation and Projection , 2018, J. Open Source Softw..
[19] Thomas Brox,et al. Highly Accurate Optic Flow Computation with Theoretically Justified Warping , 2022 .
[20] Can Zhang,et al. PAN: Persistent Appearance Network with an Efficient Motion Cue for Fast Action Recognition , 2019, ACM Multimedia.
[21] C. Bregler,et al. Large displacement optical flow , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[22] Sergey Ioffe,et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning , 2016, AAAI.
[23] Yali Wang,et al. PA3D: Pose-Action 3D Machine for Video Recognition , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Ali Borji,et al. Salient Object Detection: A Benchmark , 2015, IEEE Transactions on Image Processing.
[25] Serge J. Belongie,et al. Behavior recognition via sparse spatio-temporal features , 2005, 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance.
[26] Jitendra Malik,et al. SlowFast Networks for Video Recognition , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[27] Andrew Zisserman,et al. Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[28] Trang Nguyen,et al. Generalized Max Pooling for Action Recognition , 2015, 2015 Seventh International Conference on Knowledge and Systems Engineering (KSE).
[29] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Thomas Mensink,et al. Improving the Fisher Kernel for Large-Scale Image Classification , 2010, ECCV.
[31] Fatih Murat Porikli,et al. A Deeper Look at Power Normalizations , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[32] Tong Wu,et al. Action Recognition with Bootstrapping based Long-range Temporal Context Attention , 2019, ACM Multimedia.
[33] Kilian Q. Weinberger,et al. Feature hashing for large scale multitask learning , 2009, ICML '09.
[34] Cordelia Schmid,et al. A Spatio-Temporal Descriptor Based on 3D-Gradients , 2008, BMVC.
[35] Trevor Darrell,et al. Long-Term Recurrent Convolutional Networks for Visual Recognition and Description , 2017 .
[36] Cordelia Schmid,et al. DeepFlow: Large Displacement Optical Flow with Deep Matching , 2013, 2013 IEEE International Conference on Computer Vision.
[37] Lei Wang,et al. In defense of soft-assignment coding , 2011, 2011 International Conference on Computer Vision.
[38] Ross B. Girshick,et al. Fast R-CNN , 2015, arXiv:1504.08083.
[39] Lei Wang. Analysis and Evaluation of Kinect-based Action Recognition Algorithms , 2021, ArXiv.
[40] Lars Petersson,et al. Bilinear Attention Networks for Person Retrieval , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[41] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[42] Vijay Vasudevan,et al. Learning Transferable Architectures for Scalable Image Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[43] Jian Sun,et al. Saliency Optimization from Robust Background Detection , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[44] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[45] Heng Wang,et al. Large-Scale Weakly-Supervised Pre-Training for Video Action Recognition , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Christian Wolf,et al. Object Level Visual Reasoning in Videos , 2018, ECCV.
[47] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[48] Kaiming He,et al. Long-Term Feature Banks for Detailed Video Understanding , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[49] Cordelia Schmid,et al. Long-Term Temporal Convolutions for Action Recognition , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[50] Richard P. Wildes,et al. A New Large Scale Dynamic Texture Dataset with Application to ConvNet Understanding , 2018, ECCV.
[51] Anoop Cherian,et al. Tensor Representations for Action Recognition , 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[52] Zhuowen Tu,et al. Deeply Supervised Salient Object Detection with Short Connections , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[53] Anoop Cherian,et al. Non-linear Temporal Subspace Representations for Activity Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[54] Cor J. Veenman,et al. Visual Word Ambiguity , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[55] Piotr Koniusz,et al. Power Normalizations in Fine-Grained Image, Few-Shot Image and Graph Classification , 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[56] Michael S. Ryoo,et al. Evolving Space-Time Neural Architectures for Videos , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[57] Bernt Schiele,et al. A database for fine grained activity detection of cooking activities , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[58] K. Mikolajczyk,et al. Higher-order Occurrence Pooling on Mid- and Low-level Features: Visual Concept Detection , 2013 .
[59] Ross B. Girshick,et al. Mask R-CNN , 2017, arXiv:1703.06870.
[60] Marios Hadjieleftheriou,et al. Finding frequent items in data streams , 2008, Proc. VLDB Endow..
[61] Lior Wolf,et al. Local Trinary Patterns for human action recognition , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[62] Cordelia Schmid,et al. Action recognition by dense trajectories , 2011, CVPR 2011.
[63] Florent Perronnin,et al. Fisher Kernels on Visual Vocabularies for Image Categorization , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[64] Cordelia Schmid,et al. EpicFlow: Edge-preserving interpolation of correspondences for optical flow , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[65] Jing Zhang,et al. Deep Unsupervised Saliency Detection: A Multiple Noisy Labeling Perspective , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[66] Jitendra Malik,et al. Region-Based Convolutional Networks for Accurate Object Detection and Segmentation , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[67] William T. Freeman,et al. Orientation Histograms for Hand Gesture Recognition , 1995 .
[68] Nicu Sebe,et al. Realtime Video Classification using Dense HOF/HOG , 2014, ICMR.
[69] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[70] Tinne Tuytelaars,et al. Modeling video evolution for action recognition , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[71] Nanning Zheng,et al. A Slow-I-Fast-P Architecture for Compressed Video Action Recognition , 2020, ACM Multimedia.
[72] Huchuan Lu,et al. Saliency Detection with Recurrent Fully Convolutional Networks , 2016, ECCV.
[73] Romain Dupont,et al. A General Dense Image Matching Framework Combining Direct and Feature-Based Costs , 2013, 2013 IEEE International Conference on Computer Vision.
[74] Dima Damen,et al. Scaling Egocentric Vision: The EPIC-KITCHENS Dataset , 2018, ArXiv.
[75] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[76] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[77] Xiaowei Hu,et al. Salient Object Detection , 2017 .
[78] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[79] Jitendra Malik,et al. Large Displacement Optical Flow: Descriptor Matching in Variational Motion Estimation , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[80] Lianqiang Zhou,et al. Hallucinating Optical Flow Features for Video Classification , 2019, IJCAI.
[81] Cordelia Schmid,et al. PoTion: Pose MoTion Representation for Action Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[82] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[83] Anoop Cherian,et al. Learning Discriminative Video Representations Using Adversarial Perturbations , 2018, ECCV.
[84] Lei Wang,et al. Loss Switching Fusion with Similarity Search for Video Classification , 2019, 2019 IEEE International Conference on Image Processing (ICIP).
[85] Hui Wang,et al. Human Action Recognition Using Multi-Velocity STIPs and Motion Energy Orientation Histogram , 2014, J. Inf. Sci. Eng..
[86] Ming Shao,et al. Finding Achilles' Heel: Adversarial Attack on Multi-modal Action Recognition , 2020, ACM Multimedia.
[87] Krystian Mikolajczyk,et al. Higher-Order Occurrence Pooling for Bags-of-Words: Visual Concept Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[88] Cordelia Schmid,et al. Convolutional Kernel Networks , 2014, NIPS.
[89] Rasmus Pagh,et al. Fast and scalable polynomial kernels via explicit feature maps , 2013, KDD.
[90] Basura Fernando,et al. Learning End-to-end Video Classification with Rank-Pooling , 2016, ICML.
[91] Krystian Mikolajczyk,et al. Comparison of mid-level feature coding approaches and pooling strategies in visual concept detection , 2013, Comput. Vis. Image Underst..
[92] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[93] Lei Wang,et al. A Comparative Review of Recent Kinect-Based Action Recognition Algorithms , 2019, IEEE Transactions on Image Processing.
[94] Du Q. Huynh,et al. Hallucinating IDT Descriptors and I3D Optical Flow Features for Action Recognition With CNNs , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[95] Richard P. Wildes,et al. Spatiotemporal Residual Networks for Video Action Recognition , 2016, NIPS.
[96] Jing Zhang,et al. Few-Shot Learning via Saliency-Guided Hallucination of Samples , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[97] Mubarak Shah,et al. A 3-dimensional sift descriptor and its application to action recognition , 2007, ACM Multimedia.
[98] Berthold K. P. Horn,et al. Determining Optical Flow , 1981, Other Conferences.
[99] Cordelia Schmid,et al. AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[100] Cordelia Schmid,et al. Human Detection Using Oriented Histograms of Flow and Appearance , 2006, ECCV.
[101] Tony Jebara,et al. Probability Product Kernels , 2004, J. Mach. Learn. Res..