Specific video identification via joint learning of latent semantic concept, scene and temporal structure
暂无分享,去创建一个
Fei Su | Zhicheng Zhao | Yi-Fan Song | Fei Su | Zhicheng Zhao | Yi-Fan Song
[1] Pong C. Yuen,et al. Reduced Analytic Dependency Modeling: Robust Fusion for Visual Recognition , 2014, International Journal of Computer Vision.
[2] Yoshua Bengio,et al. Credit Assignment through Time: Alternatives to Backpropagation , 1993, NIPS.
[3] Antonio Torralba,et al. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.
[4] Mubarak Shah,et al. High-level event recognition in unconstrained videos , 2013, International Journal of Multimedia Information Retrieval.
[5] Bill Triggs,et al. Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[6] Florent Perronnin,et al. Fisher Kernels on Visual Vocabularies for Image Categorization , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[7] Nuno Vasconcelos,et al. Dynamic Pooling for Complex Event Recognition , 2013, 2013 IEEE International Conference on Computer Vision.
[8] Dapeng Li,et al. Event Bank based multimedia representation via latent group logistic regression minimization , 2016, Neurocomputing.
[9] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[10] Cordelia Schmid,et al. Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[11] Gang Hua,et al. Semantic Model Vectors for Complex Video Event Recognition , 2012, IEEE Transactions on Multimedia.
[12] Georges Quénot,et al. TRECVID 2015 - An Overview of the Goals, Tasks, Data, Evaluation Mechanisms and Metrics , 2011, TRECVID.
[13] TorralbaAntonio,et al. Modeling the Shape of the Scene , 2001 .
[14] Bolei Zhou,et al. Learning Deep Features for Scene Recognition using Places Database , 2014, NIPS.
[15] Yi Yang,et al. A discriminative CNN video representation for event detection , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Andrea Vedaldi,et al. Vlfeat: an open and portable library of computer vision algorithms , 2010, ACM Multimedia.
[17] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[18] Lawrence D. Jackel,et al. Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.
[19] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[20] Wen Gao,et al. Video Copy-Detection and Localization with a Scalable Cascading Framework , 2013, IEEE MultiMedia.
[21] Pietro Perona,et al. A Bayesian hierarchical model for learning natural scene categories , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[22] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[23] Nicu Sebe,et al. Feature Weighting via Optimal Thresholding for Video Analysis , 2013, 2013 IEEE International Conference on Computer Vision.
[24] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[25] Tao Xiang,et al. Learning Multimodal Latent Attributes , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[26] Tieniu Tan,et al. Relevance Topic Model for Unstructured Social Group Activity Recognition , 2013, NIPS.
[27] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[28] Yi Yang,et al. E-LAMP: integration of innovative ideas for multimedia event detection , 2013, Machine Vision and Applications.
[29] Masoud Mazloom,et al. Querying for video events by semantic signatures from few examples , 2013, MM '13.
[30] Mubarak Shah,et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.
[31] Yoshua Bengio,et al. Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.
[32] Matthijs C. Dorst. Distinctive Image Features from Scale-Invariant Keypoints , 2011 .
[33] Shiguang Shan,et al. Informedia@TrecVID 2014: MED and MER , 2014 .
[34] Chih-Jen Lin,et al. LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..
[35] Jian Sun,et al. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[36] Dong Liu,et al. Sample-Specific Late Fusion for Visual Category Recognition , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[37] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[38] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[39] Fei-Fei Li,et al. Learning latent temporal structure for complex event detection , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[40] Geoffrey Zweig,et al. Context dependent recurrent neural network language model , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[41] Yi Yang,et al. Complex Event Detection using Semantic Saliency and Nearly-Isotonic SVM , 2015, ICML.
[42] Nicu Sebe,et al. Where am I in the dark: Exploring active transfer learning on the use of indoor localization based on thermal imaging , 2016, Neurocomputing.
[43] Andrew Zisserman,et al. All About VLAD , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.
[44] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[45] Dong Liu,et al. Discovering joint audio–visual codewords for video event detection , 2013, Machine Vision and Applications.
[46] Shih-Fu Chang,et al. Consumer video understanding: a benchmark database and an evaluation of human and machine performance , 2011, ICMR.
[47] Cordelia Schmid,et al. Action Recognition with Improved Trajectories , 2013, 2013 IEEE International Conference on Computer Vision.
[48] Jun Wang,et al. Exploring Inter-feature and Inter-class Relationships with Deep Neural Networks for Video Classification , 2014, ACM Multimedia.
[49] Shaogang Gong,et al. Attribute Learning for Understanding Unstructured Social Activity , 2012, ECCV.
[50] Ramakant Nevatia,et al. ISOMER: Informative Segment Observations for Multimedia Event Recounting , 2014, ICMR.