SUPER: towards real-time event recognition in internet videos
暂无分享,去创建一个
[1] Cordelia Schmid,et al. Scale & Affine Invariant Interest Point Detectors , 2004, International Journal of Computer Vision.
[2] Mubarak Shah,et al. Columbia-UCF TRECVID2010 Multimedia Event Detection: Combining Multiple Modalities, Contextual Concepts, and Temporal Matching , 2010, TRECVID.
[3] Antonio Torralba,et al. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.
[4] Matti Pietikäinen,et al. Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns , 2002, IEEE Trans. Pattern Anal. Mach. Intell..
[5] Thomas Mensink,et al. Improving the Fisher Kernel for Large-Scale Image Classification , 2010, ECCV.
[6] Cordelia Schmid,et al. Learning realistic human actions from movies , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[7] Dong Liu,et al. BBN VISER TRECVID 2011 Multimedia Event Detection System , 2011, TRECVID.
[8] Frédéric Jurie,et al. Randomized Clustering Forests for Image Classification , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[9] Tae-Kyun Kim,et al. Real-time Action Recognition by Spatiotemporal Semantic and Structural Forests , 2010, BMVC.
[10] Ivan Laptev,et al. On Space-Time Interest Points , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[11] Chong-Wah Ngo,et al. Towards optimal bag-of-features for object categorization and semantic video retrieval , 2007, CIVR '07.
[12] Frédéric Jurie,et al. Sampling Strategies for Bag-of-Features Image Classification , 2006, ECCV.
[13] Daniel P. W. Ellis,et al. Audio-Based Semantic Concept Classification for Consumer Video , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[14] Subhransu Maji,et al. Classification using intersection kernel support vector machines is efficient , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[15] Eli Shechtman,et al. Matching Local Self-Similarities across Images and Videos , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[16] David Nistér,et al. Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[17] Luc Van Gool,et al. Action snippets: How many frames does human action recognition require? , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[18] Shih-Fu Chang,et al. Consumer video understanding: a benchmark database and an evaluation of human and machine performance , 2011, ICMR.
[19] Antonio Torralba,et al. Ieee Transactions on Pattern Analysis and Machine Intelligence 1 80 Million Tiny Images: a Large Dataset for Non-parametric Object and Scene Recognition , 2022 .
[20] Luc Van Gool,et al. Speeded-Up Robust Features (SURF) , 2008, Comput. Vis. Image Underst..
[21] Arnold W. M. Smeulders,et al. Real-Time Visual Concept Classification , 2010, IEEE Transactions on Multimedia.
[22] T. Stanford,et al. Multisensory integration: current issues from the perspective of the single neuron , 2008, Nature Reviews Neuroscience.
[23] Jiebo Luo,et al. Recognizing realistic actions from videos “in the wild” , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[24] G LoweDavid,et al. Distinctive Image Features from Scale-Invariant Keypoints , 2004 .
[25] Luc Van Gool,et al. Hough Transform and 3D SURF for Robust Three Dimensional Classification , 2010, ECCV.
[26] Roberto Cipolla,et al. Semantic texton forests for image categorization and segmentation , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[27] Christopher Hunt,et al. Notes on the OpenSURF Library , 2009 .