Abstract: We participated in the high-level feature extraction task of TRECVID 2007. This paper describes our system for that task. For feature extraction, we propose an EMD-based bag-of-features method to exploit visual and spatial information, and we use WordNet to expand the semantic meaning of text so that the detectors generalize better. We also explore audio features and extract motion cues in the compressed domain to detect concepts that are strongly associated with audio or motion. We combine the SVM-based multi-modal concept detection results with the Ordered Weighted Average (OWA) fusion method. Experimental results show that our methods are effective.
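As an illustration of the fusion step named in the abstract, the following is a minimal sketch of a generic Ordered Weighted Average operator, not the authors' implementation. The abstract does not say how the OWA weights are chosen or which detectors are fused, so the function name `owa_fuse`, the example scores, and the weight vector are all assumptions made for illustration. OWA sorts the input scores in descending order and applies the weights by rank position rather than by source.

```python
from typing import Sequence


def owa_fuse(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Generic Ordered Weighted Average of detector scores.

    The weights are applied to the scores after sorting them in
    descending order, so weights[0] multiplies the largest score.
    The name and interface are hypothetical, chosen for illustration.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


# Hypothetical per-modality scores for one shot and one concept
# (for example visual, text, audio, motion); values are made up.
modality_scores = [0.9, 0.4, 0.2, 0.6]

# Assumed weight vector that emphasizes the highest-ranked scores.
rank_weights = [0.4, 0.3, 0.2, 0.1]

fused = owa_fuse(modality_scores, rank_weights)
print(fused)
```

With a weight vector of (1, 0, ..., 0) the operator reduces to the maximum of the scores, and with uniform weights it reduces to the arithmetic mean, so the weight vector controls how optimistic the fusion is.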
