Paper Information - Building Detectors to Support Searches on Combined Semantic Concepts

Bridging the semantic gap is one of the major challenges in multimedia information retrieval. The gap lies between the low-level features extracted from a video and its conceptual content. A common approach to understanding the conceptual content of a video is to build concept detectors. A problem with this approach is that the number of detectors needed is impossible to determine. This paper presents a set of 8 methods for combining two existing concepts into a new one, which occurs when both concepts appear at the same time. The scores of the combined concept for each shot of a video are computed from the output of the underlying detectors. The findings are evaluated on the basis of the output of the 101 detectors, including a comparison with the theoretical possibility of training a classifier on each combined concept. The precision gains are significant, and the methods that also consider the chronological surroundings of a shot are especially promising.
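The idea of scoring a combined concept from two existing detectors can be illustrated with a minimal Python sketch. The abstract does not name the eight methods, so the three rules below (product, minimum, arithmetic mean) and the neighbour-averaging step are generic assumptions chosen for illustration, not the paper's methods. All function names are hypothetical, and detector scores are assumed to be per-shot values in [0, 1], listed in chronological order.

```python
from typing import Callable, List, Sequence

# Hypothetical combination rules (assumptions, not the paper's eight methods).
def combine_product(a: float, b: float) -> float:
    """Treat the two scores as independent probabilities."""
    return a * b

def combine_min(a: float, b: float) -> float:
    """The combined concept is only as likely as the weaker detector says."""
    return min(a, b)

def combine_mean(a: float, b: float) -> float:
    """Arithmetic mean of the two detector scores."""
    return (a + b) / 2.0

def combined_scores(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    rule: Callable[[float, float], float],
) -> List[float]:
    """Score each shot for the combined concept from two detector outputs."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both detectors must score the same shots")
    return [rule(a, b) for a, b in zip(scores_a, scores_b)]

def smooth_over_neighbours(scores: Sequence[float], window: int = 1) -> List[float]:
    """Average each shot's score with up to `window` shots before and after it.

    This is one assumed way to take the chronological surroundings of a shot
    into account; shots at the edges use the neighbours that exist.
    """
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - window)
        hi = min(len(scores), i + window + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed

if __name__ == "__main__":
    # Made-up scores for four consecutive shots from two hypothetical detectors.
    detector_a = [0.9, 0.8, 0.1, 0.7]
    detector_b = [0.2, 0.9, 0.3, 0.6]
    per_shot = combined_scores(detector_a, detector_b, combine_product)
    print(per_shot)
    print(smooth_over_neighbours(per_shot, window=1))
```

Ranking shots by the smoothed score instead of the raw combined score lets a shot benefit from strong evidence in adjacent shots, which is the intuition behind considering chronological context.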
