CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos
暂无分享,去创建一个
Shih-Fu Chang | Zheng Shou | Alireza Zareian | Jonathan Chan | Kazuyuki Miyazawa | Shih-Fu Chang | Alireza Zareian | Zheng Shou | Jonathan Chan | K. Miyazawa
[1] Iasonas Kokkinos,et al. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs , 2014, ICLR.
[2] Nicu Sebe,et al. Unsupervised Tube Extraction Using Transductive Learning and Dense Trajectories , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[3] Cordelia Schmid,et al. Action recognition by dense trajectories , 2011, CVPR 2011.
[4] Cordelia Schmid,et al. Action Recognition with Improved Trajectories , 2013, 2013 IEEE International Conference on Computer Vision.
[5] Cordelia Schmid,et al. The LEAR submission at Thumos 2014 , 2014 .
[6] Hong Zhang,et al. Multi-scale Patch Aggregation (MPA) for Simultaneous Detection and Segmentation , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Andrew Zisserman,et al. Convolutional Two-Stream Network Fusion for Video Action Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[9] Cees Snoek,et al. Spot On: Action Localization from Pointly-Supervised Proposals , 2016, ECCV.
[10] Jenny Benois-Pineau,et al. Fast Action Localization in Large-Scale Video Archives , 2016, IEEE Transactions on Circuits and Systems for Video Technology.
[11] Li Fei-Fei,et al. Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos , 2015, International Journal of Computer Vision.
[12] Chenliang Xu,et al. Actor-Action Semantic Segmentation with Grouping Process Models , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Trevor Darrell,et al. Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[14] Yi Yang,et al. DevNet: A Deep Event Network for multimedia event detection and evidence recounting , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Li Fei-Fei,et al. End-to-End Learning of Action Detection from Frame Glimpses in Videos , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Rémi Ronfard,et al. A survey of vision-based methods for action representation, segmentation and recognition , 2011, Comput. Vis. Image Underst..
[17] Cees G. M. Snoek,et al. Objects2action: Classifying and Localizing Actions without Any Video Example , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[18] Cees Snoek,et al. APT: Action localization proposals from dense trajectories , 2015, BMVC.
[19] Jean Ponce,et al. Automatic annotation of human actions in video , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[20] Bernard Ghanem,et al. DAPs: Deep Action Proposals for Action Understanding , 2016, ECCV.
[21] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[22] Cordelia Schmid,et al. Temporal Localization of Actions with Actoms. , 2013, IEEE transactions on pattern analysis and machine intelligence.
[23] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[24] Bernt Schiele,et al. A database for fine grained activity detection of cooking activities , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[25] Vladlen Koltun,et al. Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.
[26] Jenny Benois-Pineau,et al. Scalable action localization with kernel-space hashing , 2015, 2015 IEEE International Conference on Image Processing (ICIP).
[27] Roberto Cipolla,et al. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[28] Haroon Idrees,et al. Predicting the Where and What of Actors and Actions through Online Action Localization , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Gregory D. Hager,et al. Segmental Spatiotemporal CNNs for Fine-Grained Action Segmentation , 2016, ECCV.
[30] Gang Yu,et al. Fast action proposals for human action detection and search , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Lorenzo Torresani,et al. Deep End2End Voxel2Voxel Prediction , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[32] Graham W. Taylor,et al. Adaptive deconvolutional networks for mid and high level feature learning , 2011, 2011 International Conference on Computer Vision.
[33] Ramakant Nevatia,et al. Temporal Localization of Fine-Grained Actions in Videos by Domain Transfer from Web Images , 2015, ACM Multimedia.
[34] Thomas Mensink,et al. Image Classification with the Fisher Vector: Theory and Practice , 2013, International Journal of Computer Vision.
[35] J.K. Aggarwal,et al. Human activity analysis , 2011, ACM Comput. Surv..
[36] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[37] Luc Van Gool,et al. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition , 2016, ECCV.
[38] Mubarak Shah,et al. What If We Do Not have Multiple Videos of the Same Action? — Video Action Localization Using Web Images , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[39] Vibhav Vineet,et al. Conditional Random Fields as Recurrent Neural Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[40] Cees Snoek,et al. Objects2action: Classifying and Localizing Actions without Any Video Example , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[41] Cordelia Schmid,et al. Action and Event Recognition with Fisher Vectors on a Compact Feature Set , 2013, 2013 IEEE International Conference on Computer Vision.
[42] Ming Shao,et al. A Multi-stream Bi-directional Recurrent Neural Network for Fine-Grained Action Detection , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[43] Haroon Idrees,et al. Action Localization in Videos through Context Walk , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[44] Ali Farhadi,et al. Much Ado About Time: Exhaustive Annotation of Temporal Data , 2016, HCOMP.
[45] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[46] Juergen Gall,et al. Temporal Action Detection Using a Statistical Language Model , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[47] Yi Yang,et al. A discriminative CNN video representation for event detection , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Lorenzo Torresani,et al. Deep End2End Voxel2Voxel Prediction , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[49] Thomas Mensink,et al. Improving the Fisher Kernel for Large-Scale Image Classification , 2010, ECCV.
[50] Cordelia Schmid,et al. Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[51] Patrick Bouthemy,et al. Action Localization with Tubelets from Motion , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[52] Bernard Ghanem,et al. Fast Temporal Activity Proposals for Efficient Detection of Human Actions in Untrimmed Videos , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[53] Jitendra Malik,et al. Finding action tubes , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[54] Seunghoon Hong,et al. Decoupled Deep Neural Network for Semi-supervised Semantic Segmentation , 2015, NIPS.
[55] Xiaoou Tang,et al. Action Recognition and Detection by Combining Motion and Appearance Features , 2014 .
[56] Patrick Pérez,et al. Retrieving actions in movies , 2007, 2007 IEEE 11th International Conference on Computer Vision.
[57] Seunghoon Hong,et al. Learning Deconvolution Network for Semantic Segmentation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[58] Ali Farhadi,et al. Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding , 2016, ECCV.
[59] Shih-Fu Chang,et al. Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[60] Guosheng Lin,et al. Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[61] Trevor Darrell,et al. Long-term recurrent convolutional networks for visual recognition and description , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[62] Graham W. Taylor,et al. Deconvolutional networks , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[63] Cordelia Schmid,et al. Human Focused Action Localization in Video , 2010, ECCV Workshops.
[64] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[65] Iasonas Kokkinos,et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[66] Cordelia Schmid,et al. Actom sequence models for efficient action detection , 2011, CVPR 2011.
[67] Cordelia Schmid,et al. Learning to Track for Spatio-Temporal Action Localization , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[68] Cees Snoek,et al. What do 15,000 object categories tell us about classifying and localizing actions? , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[69] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[70] Bingbing Ni,et al. Temporal Action Localization with Pyramid of Score Distribution Features , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[71] Bernard Ghanem,et al. ActivityNet: A large-scale video benchmark for human activity understanding , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[72] Ronald Poppe,et al. A survey on vision-based human action recognition , 2010, Image Vis. Comput..
[73] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[74] Fabio Cuzzolin,et al. Untrimmed Video Classification for Activity Detection: submission to ActivityNet Challenge , 2016, ArXiv.
[75] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.