High-level event recognition in unconstrained videos
Mubarak Shah | Shih-Fu Chang | Yu-Gang Jiang | Subhabrata Bhattacharya
[1] Emmon W. Bach,et al. Universals in Linguistic Theory , 1970 .
[2] Takeo Kanade,et al. An Iterative Image Registration Technique with an Application to Stereo Vision , 1981, IJCAI.
[3] James F. Allen. Maintaining knowledge about temporal intervals , 1983, CACM.
[4] C. Atkeson,et al. Kinematic features of unrestrained vertical arm movements , 1985, The Journal of neuroscience : the official journal of the Society for Neuroscience.
[5] Junji Yamato,et al. Recognizing human action in time-sequential images using hidden Markov model , 1992, Proceedings 1992 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[6] R. Patterson,et al. Complex Sounds and Auditory Images , 1992 .
[7] Ivan A. Sag,et al. Book Reviews: Head-driven Phrase Structure Grammar and German in Head-driven Phrase-structure Grammar , 1996, CL.
[8] Thad Starner,et al. Visual Recognition of American Sign Language Using Hidden Markov Models. , 1995 .
[9] B. S. Manjunath,et al. Texture Features for Browsing and Retrieval of Image Data , 1996, IEEE Trans. Pattern Anal. Mach. Intell..
[10] Wenjun Zeng,et al. Integrated image and speech analysis for content-based video indexing , 1996, Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems.
[11] A F Bobick,et al. Movement, activity and action: the role of knowledge in the perception of motion. , 1997, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.
[12] Alvin F. Martin,et al. The DET curve in assessment of detection task performance , 1997, EUROSPEECH.
[13] Hiroshi Hamada,et al. Video Handling with Music and Speech Detection , 1998, IEEE Multim..
[14] Marcel Worring,et al. Content-Based Image Retrieval at the End of the Early Years , 2000, IEEE Trans. Pattern Anal. Mach. Intell..
[15] Aaron F. Bobick,et al. Recognition of Visual Activities and Interactions by Stochastic Parsing , 2000, IEEE Trans. Pattern Anal. Mach. Intell..
[16] Michele Banko,et al. Headline Generation Based on Statistical Translation , 2000, ACL.
[17] C. Granger. Investigating causal relations by econometric models and cross-spectral methods , 1969 .
[18] Aaron F. Bobick,et al. Recognizing Planned, Multiperson Action , 2001, Comput. Vis. Image Underst..
[19] Irfan Essa,et al. Recognizing Multitasked Activities using Stochastic Context-Free Grammar , 2001 .
[20] Paul A. Viola,et al. Rapid object detection using a boosted cascade of simple features , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.
[21] Ioannis Pitas,et al. Content-based video parsing and indexing based on audio-visual interaction , 2001, IEEE Trans. Circuits Syst. Video Technol..
[22] Ralf Herbrich,et al. Learning Kernel Classifiers: Theory and Algorithms , 2001 .
[23] Shih-Fu Chang,et al. Event detection in baseball video using superimposed caption recognition , 2002, MULTIMEDIA '02.
[24] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[25] Matti Pietikäinen,et al. Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns , 2002, IEEE Trans. Pattern Anal. Mach. Intell..
[26] Nebojsa Jojic,et al. A Graphical Model for Audiovisual Object Tracking , 2003, IEEE Trans. Pattern Anal. Mach. Intell..
[27] Andrew Zisserman,et al. Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[28] Joemon M. Jose,et al. Audio-Based Event Detection for Sports Video , 2003, CIVR.
[29] Mohan S. Kankanhalli,et al. Creating audio keywords for event detection in soccer video , 2003, 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698).
[30] Ivan Laptev,et al. On Space-Time Interest Points , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[31] Yaser Sheikh,et al. CASEE: A Hierarchical Event Representation for the Analysis of Videos , 2004, AAAI.
[32] David G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints , 2004 .
[33] R. Sukthankar,et al. PCA-SIFT: a more distinctive representation for local image descriptors , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..
[34] Yan Ke,et al. PCA-SIFT: a more distinctive representation for local image descriptors , 2004, CVPR 2004.
[35] Antonio Torralba,et al. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.
[36] Cordelia Schmid,et al. Scale & Affine Invariant Interest Point Detectors , 2004, International Journal of Computer Vision.
[37] Jiri Matas,et al. Robust wide-baseline stereo from maximally stable extremal regions , 2004, Image Vis. Comput..
[38] Larry S. Davis,et al. Representation and Recognition of Events in Surveillance Video Using Petri Nets , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.
[39] Shih-Fu Chang,et al. Structure analysis of soccer video with domain knowledge and hidden Markov models , 2004, Pattern Recognit. Lett..
[40] Tony Lindeberg,et al. Feature Detection with Automatic Scale Selection , 1998, International Journal of Computer Vision.
[41] B. Caputo,et al. Recognizing human actions: a local SVM approach , 2004, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004..
[42] Leonidas J. Guibas,et al. The Earth Mover's Distance as a Metric for Image Retrieval , 2000, International Journal of Computer Vision.
[43] Kunio Fukunaga,et al. Natural Language Description of Human Activities from Video Images Based on Concept Hierarchy of Actions , 2002, International Journal of Computer Vision.
[44] Ronen Basri,et al. Actions as space-time shapes , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.
[45] Cordelia Schmid,et al. A Comparison of Affine Region Detectors , 2005, International Journal of Computer Vision.
[46] Cordelia Schmid,et al. A performance evaluation of local descriptors , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[47] Brendan J. Frey,et al. A comparison of algorithms for inference and learning in probabilistic graphical models , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[48] Noel E. O'Connor,et al. Event detection in field sports video using audio-visual features and a support vector Machine , 2005, IEEE Transactions on Circuits and Systems for Video Technology.
[49] Ramakant Nevatia,et al. VERL: An Ontology Framework for Representing and Annotating Video Events , 2005, IEEE Multim..
[50] Serge J. Belongie,et al. Behavior recognition via sparse spatio-temporal features , 2005, 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance.
[51] Daniel P. W. Ellis,et al. Song-Level Features and Support Vector Machines for Music Classification , 2005, ISMIR.
[52] Bill Triggs,et al. Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[53] Jianguo Zhang,et al. The PASCAL Visual Object Classes Challenge , 2006 .
[54] Paul Over,et al. Evaluation campaigns and TRECVid , 2006, MIR '06.
[55] Winston H. Hsu,et al. Brief Descriptions of Visual Features for Baseline TRECVID Concept Detectors , 2006 .
[56] Vesa T. Peltonen,et al. Audio-based context recognition , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[57] Luc Van Gool,et al. The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.
[58] Frédéric Jurie,et al. Sampling Strategies for Bag-of-Features Image Classification , 2006, ECCV.
[59] Cordelia Schmid,et al. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[60] John R. Smith,et al. Large-scale concept ontology for multimedia , 2006, IEEE MultiMedia.
[61] Jake K. Aggarwal,et al. Recognition of Composite Human Activities through Context-Free Grammar Based Representation , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[62] David Nistér,et al. Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[63] Rama Chellappa,et al. Attribute Grammar-Based Event Recognition and Anomaly Detection , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).
[64] Chung-Lin Huang,et al. Semantic analysis of soccer video using dynamic Bayesian network , 2006, IEEE Transactions on Multimedia.
[65] Cordelia Schmid,et al. Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).
[66] Rémi Ronfard,et al. Free viewpoint action recognition using motion history volumes , 2006, Comput. Vis. Image Underst..
[67] Antonio Torralba,et al. LabelMe: A Database and Web-Based Tool for Image Annotation , 2008, International Journal of Computer Vision.
[68] François Pachet,et al. The bag-of-frames approach to audio pattern recognition: a sufficient model for urban soundscapes but not for polyphonic music. , 2007, The Journal of the Acoustical Society of America.
[69] Liang Wang,et al. Recognizing Human Activities from Silhouettes: Motion Subspace and Factorial Discriminative Graphical Model , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[70] Jiebo Luo,et al. Kodak consumer video benchmark data set: concept definition and annotation , 2008 .
[71] Manuela M. Veloso,et al. Conditional random fields for activity recognition , 2007, AAMAS '07.
[72] Eli Shechtman,et al. Matching Local Self-Similarities across Images and Videos , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[73] Chong-Wah Ngo,et al. Towards optimal bag-of-features for object categorization and semantic video retrieval , 2007, CIVR '07.
[74] Christopher I. Connolly. Learning to Recognize Complex Actions Using Conditional Random Fields , 2007, ISVC.
[75] Mubarak Shah,et al. A 3-dimensional sift descriptor and its application to action recognition , 2007, ACM Multimedia.
[76] Richard I. Hartley,et al. Optimised KD-trees for fast image descriptor matching , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[77] Cordelia Schmid,et al. Learning realistic human actions from movies , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[78] Luc Van Gool,et al. An Efficient Dense and Scale-Invariant Spatio-Temporal Interest Point Detector , 2008, ECCV.
[79] Mubarak Shah,et al. Action MACH a spatio-temporal Maximum Average Correlation Height filter for action recognition , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[80] Chong-Wah Ngo,et al. Columbia University/VIREO-CityU/IRIT TRECVID2008 High-Level Feature Extraction and Interactive Video Search , 2008, TRECVID.
[81] Frédéric Jurie,et al. Randomized Clustering Forests for Image Classification , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[82] Mubarak Shah,et al. Learning human actions via information maximization , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[83] R. Nevatia,et al. Online, Real-time Tracking and Recognition of Human Actions , 2008, 2008 IEEE Workshop on Motion and video Computing.
[84] Cordelia Schmid,et al. A Spatio-Temporal Descriptor Based on 3D-Gradients , 2008, BMVC.
[85] Dong Xu,et al. Video Event Recognition Using Kernel Methods with Multilevel Temporal Alignment , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[86] Krystian Mikolajczyk,et al. Feature Tracking and Motion Compensation for Action Recognition , 2008, BMVC.
[87] Eli Shechtman,et al. In defense of Nearest-Neighbor based image classification , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[88] Yoshua Bengio,et al. Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.
[89] Antonio Torralba,et al. Spectral Hashing , 2008, NIPS.
[90] Diane J. Cook,et al. Automatic Video Classification: A Survey of the Literature , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[91] Zicheng Liu,et al. Expandable Data-Driven Graphical Modeling of Human Actions Based on Salient Postures , 2008, IEEE Transactions on Circuits and Systems for Video Technology.
[92] Lie Lu,et al. Audio Keywords Discovery for Text-Like Audio Content Analysis and Retrieval , 2008, IEEE Transactions on Multimedia.
[93] Jiebo Luo,et al. Utilizing semantic word similarity measures for video retrieval , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[94] Subhransu Maji,et al. Classification using intersection kernel support vector machines is efficient , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[95] Michael Isard,et al. Lost in quantization: Improving particular object retrieval in large scale image databases , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[96] Changsheng Xu,et al. A Novel Framework for Semantic Annotation and Personalized Retrieval of Sports Video , 2008, IEEE Transactions on Multimedia.
[97] Luc Van Gool,et al. Speeded-Up Robust Features (SURF) , 2008, Comput. Vis. Image Underst..
[98] Roberto Cipolla,et al. Semantic texton forests for image categorization and segmentation , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[99] Chong-Wah Ngo,et al. Video event detection using motion relativity and visual relatedness , 2008, ACM Multimedia.
[100] Larry S. Davis,et al. Event Modeling and Recognition Using Markov Logic Networks , 2008, ECCV.
[101] Rama Chellappa,et al. Machine Recognition of Human Activities: A Survey , 2008, IEEE Transactions on Circuits and Systems for Video Technology.
[102] Jintao Li,et al. Hierarchical spatio-temporal context modeling for action recognition , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[103] Christopher Joseph Pal,et al. Activity recognition using the velocity histories of tracked keypoints , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[104] Yongdong Zhang,et al. VideoMap: an interactive video retrieval system of MCG-ICT-CAS , 2009, CIVR '09.
[105] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[106] Ehud Rivlin,et al. Understanding Video Events: A Survey of Methods for Automatic Interpretation of Semantic Occurrences in Video , 2009, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[107] Rong Yan,et al. Large-scale multimedia semantic concept modeling using robust subspace bagging and MapReduce , 2009, LS-MMRM '09.
[108] Yannis Avrithis,et al. Dense saliency-based spatiotemporal feature points for action recognition , 2009, CVPR.
[109] Christopher Hunt,et al. Notes on the OpenSURF Library , 2009 .
[110] S. Kollias,et al. Dense saliency-based spatiotemporal feature points for action recognition , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[111] Marcel Worring,et al. Concept-Based Video Retrieval , 2009, Found. Trends Inf. Retr..
[112] Cordelia Schmid,et al. Evaluation of Local Spatio-temporal Features for Action Recognition , 2009, BMVC.
[113] Jiebo Luo,et al. Recognizing realistic actions from videos “in the wild” , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[114] Yang Wang,et al. Max-margin hidden conditional random fields for human action recognition , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[115] Jean Ponce,et al. Automatic annotation of human actions in video , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[116] Andrew Zisserman,et al. Multiple kernels for object detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[117] Antonio Torralba,et al. LabelMe video: Building a video database with human annotations , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[118] Shih-Fu Chang,et al. Short-term audio-visual atoms for generic video concept classification , 2009, ACM Multimedia.
[119] Greg Mori,et al. Max-margin hidden conditional random fields for human action recognition , 2009, CVPR.
[120] David A. McAllester,et al. Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[121] Yihong Gong,et al. Action detection in complex scenes with spatial and temporal ambiguities , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[122] Liang Lin,et al. I2T: Image Parsing to Text Description , 2010, Proceedings of the IEEE.
[123] Arnold W. M. Smeulders,et al. Real-Time Visual Concept Classification , 2010, IEEE Transactions on Multimedia.
[124] Luc Van Gool,et al. Hough Transform and 3D SURF for Robust Three Dimensional Classification , 2010, ECCV.
[125] Koen E. A. van de Sande,et al. Evaluating Color Descriptors for Object and Scene Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[126] David Elliott,et al. In the Wild , 2010 .
[127] Tinne Tuytelaars,et al. Dense interest points , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[128] Chong-Wah Ngo,et al. Representations of Keypoint-Based Semantic Concept Detection: A Comprehensive Study , 2010, IEEE Transactions on Multimedia.
[129] Thomas Mensink,et al. Improving the Fisher Kernel for Large-Scale Image Classification , 2010, ECCV.
[130] Alberto Del Bimbo,et al. Event detection and recognition for semantic annotation of video , 2010, Multimedia Tools and Applications.
[131] Martial Hebert,et al. Modeling the Temporal Extent of Actions , 2010, ECCV.
[132] Yann LeCun,et al. Convolutional Learning of Spatio-temporal Features , 2010, ECCV.
[133] Junsong Yuan,et al. Middle-Level Representation for Human Activities Recognition: The Role of Spatio-Temporal Relationships , 2010, ECCV Workshops.
[134] Gang Hua,et al. IBM Research TRECVID-2010 Video Copy Detection and Multimedia Event Detection System , 2010, TRECVID.
[135] Samy Bengio,et al. Sound Retrieval and Ranking Using Sparse Auditory Representations , 2010, Neural Computation.
[136] Mubarak Shah,et al. Human Action Recognition in Videos Using Kinematic Features and Multiple Instance Learning , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[137] Mubarak Shah,et al. Columbia-UCF TRECVID2010 Multimedia Event Detection: Combining Multiple Modalities, Contextual Concepts, and Temporal Matching , 2010, TRECVID.
[138] Cor J. Veenman,et al. Visual Word Ambiguity , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[139] Christopher Joseph Pal,et al. YouTube Scale, Large Vocabulary Video Annotation , 2010, Video Search and Mining.
[140] Ivor W. Tsang,et al. Visual Event Recognition in Videos by Learning from Web Data , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[141] Daniel P. W. Ellis,et al. Audio-Based Semantic Concept Classification for Consumer Video , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[142] Tae-Kyun Kim,et al. Real-time Action Recognition by Spatiotemporal Semantic and Structural Forests , 2010, BMVC.
[143] Jimmy J. Lin,et al. Web-scale computer vision using MapReduce for multimedia data mining , 2010, MDMKDD '10.
[144] Mario Cannataro,et al. Protein-to-protein interactions: Technologies, databases, and algorithms , 2010, CSUR.
[145] Shih-Fu Chang,et al. Semi-supervised hashing for scalable image retrieval , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[146] Yansong Feng,et al. How Many Words Is a Picture Worth? Automatic Caption Generation for News Images , 2010, ACL.
[147] Ronald Poppe,et al. A survey on vision-based human action recognition , 2010, Image Vis. Comput..
[148] Pascal Vincent,et al. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion , 2010, J. Mach. Learn. Res..
[149] Stefano Soatto,et al. Tracklet Descriptors for Action Modeling and Video Analysis , 2010, ECCV.
[150] Andrew W. Fitzgibbon,et al. Efficient Object Category Recognition Using Classemes , 2010, ECCV.
[151] Dong Liu,et al. BBN VISER TRECVID 2011 Multimedia Event Detection System , 2011, TRECVID.
[152] Alexander C. Loui,et al. Audio-visual grouplet: temporal audio-visual interactions for general video concept classification , 2011, ACM Multimedia.
[153] Florian Metze,et al. Informedia @ TRECVID 2011 , 2011 .
[154] Daniel P. W. Ellis,et al. Soundtrack classification by transient events , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[155] Chong-Wah Ngo,et al. Towards textually describing complex video contents with audio-visual concept classifiers , 2011, ACM Multimedia.
[156] Hsin-Min Wang,et al. Automatic annotation of Web videos , 2011, 2011 IEEE International Conference on Multimedia and Expo.
[157] Cordelia Schmid,et al. Action recognition by dense trajectories , 2011, CVPR 2011.
[158] Quoc V. Le,et al. Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis , 2011, CVPR 2011.
[159] Silvio Savarese,et al. Recognizing human actions by attributes , 2011, CVPR 2011.
[160] Koichi Shinoda,et al. TokyoTech+Canon at TRECVID 2011 , 2011, TRECVID.
[161] Maja Pantic,et al. Spatiotemporal Localization and Categorization of Human Actions in Unsegmented Image Sequences , 2011, IEEE Transactions on Image Processing.
[162] Thomas Serre,et al. HMDB: A large video database for human motion recognition , 2011, 2011 International Conference on Computer Vision.
[163] Vicente Ordonez,et al. Im2Text: Describing Images Using 1 Million Captioned Photographs , 2011, NIPS.
[164] James Allan,et al. Team SRI-Sarnoff's AURORA System @ TRECVID 2011 , 2012, TRECVID.
[165] Shih-Fu Chang,et al. Consumer video understanding: a benchmark database and an evaluation of human and machine performance , 2011, ICMR.
[166] Qiang Ji,et al. Efficient Structure Learning of Bayesian Networks using Constraints , 2011, J. Mach. Learn. Res..
[167] Mubarak Shah,et al. Action recognition in videos acquired by a moving camera using motion decomposition of Lagrangian particle trajectories , 2011, 2011 International Conference on Computer Vision.
[168] Benjamin Z. Yao,et al. Unsupervised learning of event AND-OR grammar and semantics from video , 2011, 2011 International Conference on Computer Vision.
[169] Yu-Gang Jiang,et al. SUPER: towards real-time event recognition in internet videos , 2012, ICMR.
[170] Sven J. Dickinson,et al. Large-Scale Automatic Labeling of Video Events with Verbs Based on Event-Participant Interaction , 2012, ArXiv.
[171] Kilian Q. Weinberger,et al. Marginalized Denoising Autoencoders for Domain Adaptation , 2012, ICML.
[172] Dong Liu,et al. Robust late fusion with rank minimization , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[173] Chong-Wah Ngo,et al. Trajectory-Based Modeling of Human Actions with Motion Reference Points , 2012, ECCV.
[174] Dong Liu,et al. Joint audio-visual bi-modal codewords for video event detection , 2012, ICMR.