Multimedia event detection with multimodal feature fusion and temporal concept localization
A. G. Amitha Perera | Sangmin Oh | Jason J. Corso | Greg Mori | Scott McCloskey | Hossein Hajimirsadeghi | Megha Pandey | Ilseo Kim | Arash Vahdat | Kevin J. Cannons
[1] John R. Smith,et al. Multimedia semantic indexing using model vectors , 2003, 2003 International Conference on Multimedia and Expo (ICME '03).
[2] Andrew Zisserman,et al. Efficient Additive Kernels via Explicit Feature Maps , 2012, IEEE Trans. Pattern Anal. Mach. Intell.
[3] Michael I. Jordan,et al. Modeling annotated data , 2003, SIGIR.
[4] Shuang Wu,et al. Multimodal feature fusion for robust event detection in web videos , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[5] Fei-Fei Li,et al. What Does Classifying More Than 10,000 Image Categories Tell Us? , 2010, ECCV.
[6] Chin-Hui Lee,et al. Consumer-level multimedia event detection through unsupervised audio signal modeling , 2012, INTERSPEECH.
[7] Fei-Fei Li,et al. Spatially Coherent Latent Topic Model for Concurrent Segmentation and Classification of Objects and Scenes , 2007, 2007 IEEE 11th International Conference on Computer Vision.
[8] Daniel P. W. Ellis,et al. Audio-Based Semantic Concept Classification for Consumer Video , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[9] Anderson Rocha,et al. Robust Fusion: Extreme Value Theory for Recognition Score Normalization , 2010, ECCV.
[10] Xi Chen,et al. Text classification with kernels on the multinomial manifold , 2005, SIGIR '05.
[11] Arun Ross,et al. Score normalization in multimodal biometric systems , 2005, Pattern Recognit.
[12] Antonio Torralba,et al. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.
[13] Hagai Attias,et al. Topic regression multi-modal Latent Dirichlet Allocation for image annotation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[14] Rich Caruana,et al. Predicting good probabilities with supervised learning , 2005, ICML.
[15] Haizhou Li,et al. An acoustic segment model approach to incorporating temporal information into speaker modeling for text-independent speaker recognition , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[16] Koen E. A. van de Sande,et al. Evaluating Color Descriptors for Object and Scene Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[17] Chin-Hui Lee,et al. Optimization of average precision with Maximal Figure-of-Merit Learning , 2011, 2011 IEEE International Workshop on Machine Learning for Signal Processing.
[18] Pong C. Yuen,et al. Linear dependency modeling for feature fusion , 2011, 2011 International Conference on Computer Vision.
[19] Andrew Zisserman,et al. Multiple kernels for object detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[20] Dong Liu,et al. Robust late fusion with rank minimization , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[21] Yongdong Zhang,et al. Explicit and implicit concept-based video retrieval with bipartite graph propagation model , 2010, ACM Multimedia.
[22] Alexander C. Loui,et al. Audio-visual grouplet: temporal audio-visual interactions for general video concept classification , 2011, ACM Multimedia.
[23] Rong Yan,et al. Video Retrieval Based on Semantic Concepts , 2008, Proceedings of the IEEE.
[24] Yansong Feng,et al. Topic Models for Image Annotation and Text Illustration , 2010, HLT-NAACL.
[25] Greg Mori,et al. Max-margin hidden conditional random fields for human action recognition , 2009, CVPR.
[26] Li Li,et al. A Survey on Visual Content-Based Video Indexing and Retrieval , 2011, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[27] Chong Wang,et al. Simultaneous image classification and annotation , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[28] Chin-Hui Lee,et al. A MFoM learning approach to robust multiclass multi-label text categorization , 2004, ICML.
[29] Ernest Valveny,et al. Optimal Classifier Fusion in a Non-Bayesian Probabilistic Framework , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[30] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[31] Hao Su,et al. Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification , 2010, NIPS.
[32] Marcel Worring,et al. The challenge problem for automated detection of 101 semantic concepts in multimedia , 2006, MM '06.
[33] Bill Triggs,et al. Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[34] Chin-Hui Lee,et al. On the importance of modeling temporal information in music tag annotation , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[35] David A. McAllester,et al. Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[36] Krista A. Ehinger,et al. SUN database: Large-scale scene recognition from abbey to zoo , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[37] Wei Liu,et al. Double Fusion for Multimedia Event Detection , 2012, MMM.
[38] Cordelia Schmid,et al. TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[39] Michael I. Jordan,et al. Multiple kernel learning, conic duality, and the SMO algorithm , 2004, ICML.
[40] Yang Wang,et al. Kernel Latent SVM for Visual Recognition , 2012, NIPS.
[41] Subhransu Maji,et al. Classification using intersection kernel support vector machines is efficient , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[42] Yanxi Liu,et al. Local Expert Forest of Score Fusion for Video Event Classification , 2012, ECCV.
[43] Quoc V. Le,et al. Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis , 2011, CVPR 2011.
[44] Shuicheng Yan,et al. Towards a universal detector by mining concepts with small semantic gaps , 2010, Expert Syst. Appl.
[45] Cordelia Schmid,et al. A Spatio-Temporal Descriptor Based on 3D-Gradients , 2008, BMVC.
[46] Biing-Hwang Juang,et al. Pattern recognition using a family of design algorithms based upon the generalized probabilistic descent method , 1998, Proc. IEEE.
[47] Georges Quénot,et al. TRECVID 2011 - An Overview of the Goals, Tasks, Data, Evaluation Mechanisms and Metrics , 2011, TRECVID.
[48] Hui Cheng,et al. Evaluation of low-level features and their combinations for complex event detection in open source videos , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[49] Frank K. Soong,et al. A segment model based approach to speech recognition , 1988, ICASSP-88, International Conference on Acoustics, Speech, and Signal Processing.
[50] Vladimir Pavlovic,et al. A New Baseline for Image Annotation , 2008, ECCV.
[51] Daniel P. W. Ellis,et al. IBM Research and Columbia University TRECVID-2012 Multimedia Event Detection (MED), Multimedia Event Recounting (MER), and Semantic Indexing (SIN) Systems , 2012, TRECVID.
[52] Chin-Hui Lee,et al. Explicit Performance Metric Optimization for Fusion-Based Video Retrieval , 2012, ECCV Workshops.
[53] Jiri Matas,et al. On Combining Classifiers , 1998, IEEE Trans. Pattern Anal. Mach. Intell.
[54] Alexander G. Hauptmann,et al. Leveraging high-level and low-level features for multimedia event detection , 2012, ACM Multimedia.
[55] Chih-Jen Lin,et al. LIBSVM: A library for support vector machines , 2011, TIST.
[56] Mubarak Shah,et al. Columbia-UCF TRECVID2010 Multimedia Event Detection: Combining Multiple Modalities, Contextual Concepts, and Temporal Matching , 2010, TRECVID.