Paper information - THU and ICRC at TRECVID 2007

Abstract:

Shot boundary detection

The shot boundary detection system in 2007 is basically the same as last year's, with three major modifications. First, the CUT detector and the gradual transition (GT) detector use block-based RGB color histograms with different parameters rather than the same ones. Second, we add a motion detection module to the GT detector to remove false alarms caused by camera motion or large object movements. Finally, we add a post-processing module based on SIFT features after both the CUT and GT detectors. The evaluation results show that all of these modifications improve the system's performance. The following table briefly describes each run:

Run_id  Description
Thu01   Baseline system: RGB4_48 for CUT and GT detector, no motion detector, no SIFT post-processing, only using development set of 2005 as training set
Thu02   Same algorithm as thu01, but with RGB16_48 for CUT detector, RGB4_48 for GT detector
Thu03   Same algorithm as thu02, but with SIFT post-processing for CUT
Thu04   Same algorithm as thu03, but with motion detector for GT
Thu05   Same algorithm as thu04, but with SIFT post-processing for GT
Thu06   Same algorithm as thu05, but no SIFT processing for CUT
Thu09   Same algorithm as thu05, but with different parameters
thu11   Same algorithm as thu05, but with different parameters
Thu13   Same algorithm as thu05, but with different parameters
Thu14   Same algorithm and parameters as thu05, but trained with all the development data from 2003-2006

High-level feature extraction

We try a novel approach, Multi-Label Multi-Feature learning (MLMF learning), to learn a joint-concept distribution at the regional level as an intermediate representation. In addition, we improve our Video diver indexing system by designing new features, comparing learning algorithms, and exploring novel fusion algorithms. Based on these improvements to the feature, learning, and fusion algorithms, we achieve top results in HFE this year.
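For the shot boundary detection part of the abstract, the general idea of comparing block-based RGB color histograms between consecutive frames can be illustrated with a minimal sketch. This is not the authors' detector: the abstract does not define what RGB4_48 or RGB16_48 mean, which distance is used, or how decisions are made, so the grid size, bin count, L1 distance, fixed threshold, and all function names below are assumptions chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]
Frame = Sequence[Sequence[Pixel]]  # rows of (R, G, B) pixels, values 0..255


def block_histograms(frame: Frame, grid: int = 4, bins: int = 4) -> List[List[float]]:
    """Split the frame into grid x grid blocks and return one
    normalized joint RGB histogram (bins**3 entries) per block."""
    height, width = len(frame), len(frame[0])
    hists = [[0.0] * (bins ** 3) for _ in range(grid * grid)]
    counts = [0] * (grid * grid)
    for y, row in enumerate(frame):
        by = y * grid // height  # block row containing this pixel
        for x, (r, g, b) in enumerate(row):
            bx = x * grid // width  # block column containing this pixel
            # Quantize each channel to `bins` levels and build a joint index.
            idx = ((r * bins // 256) * bins + (g * bins // 256)) * bins + (b * bins // 256)
            hists[by * grid + bx][idx] += 1.0
            counts[by * grid + bx] += 1
    for hist, n in zip(hists, counts):
        if n:
            for i in range(len(hist)):
                hist[i] /= n
    return hists


def frame_distance(a: Frame, b: Frame, grid: int = 4, bins: int = 4) -> float:
    """Mean L1 distance between corresponding block histograms, in [0, 1]."""
    ha = block_histograms(a, grid, bins)
    hb = block_histograms(b, grid, bins)
    total = sum(sum(abs(p - q) for p, q in zip(x, y)) for x, y in zip(ha, hb))
    return total / (2.0 * len(ha))


def detect_cuts(frames: Sequence[Frame], threshold: float = 0.5) -> List[int]:
    """Return indices i where frame i differs strongly from frame i - 1.
    A fixed threshold is an illustrative assumption, not the paper's rule."""
    return [
        i for i in range(1, len(frames))
        if frame_distance(frames[i - 1], frames[i]) > threshold
    ]


def solid(color: Pixel, size: int = 8) -> Frame:
    """Hypothetical helper: a size x size frame filled with one color."""
    return [[color] * size for _ in range(size)]


if __name__ == "__main__":
    red, blue = solid((255, 0, 0)), solid((0, 0, 255))
    print(detect_cuts([red, red, blue, blue]))
```

Comparing histograms per block rather than over the whole frame keeps some spatial information, so two frames with the same overall colors but a different layout still produce a nonzero distance. A simple histogram difference of this kind also reacts to camera motion and large object movements, which is the type of false alarm the abstract's motion detection and SIFT post-processing modules are meant to remove.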

Cited by

Exploiting semantics on external resources to gather visual examples for video retrieval

International Journal of Multimedia Information Retrieval

2012

Video copy detection using multiple visual cues and MPEG-7 descriptors

J. Vis. Commun. Image Represent.

2010

Probabilistic optimized ranking for multimedia semantic concept detection via RVM

CIVR '08

2008

Walsh–Hadamard Transform Kernel-Based Feature Vector for Shot Boundary Detection

IEEE Transactions on Image Processing

2014

vdFP - Video fingerprinting technologies for media and security applications - D1 : Report on existing technologies

2010

One step beyond histograms: Image representation using Markov stationary features

2008 IEEE Conference on Computer Vision and Pattern Recognition

2008

Performance evaluation of early and late fusion methods for generic semantics indexing

Pattern Analysis and Applications

2013

BUPT-MCPRL at TRECVID 2009

TRECVID

2009

Unsupervised Detection of Gradual Video Shot Changes with Motion-Based False Alarm Removal

ACIVS

2009

Video annotation based on adaptive annular spatial partition scheme

IEICE Electron. Express

2010

Video shot boundary detection using multiscale geometric analysis of nsct and least squares support vector machine

Multimedia Tools and Applications

2018

SHOT BOUNDARY DETECTION USING HILBERT TRANSFORM BASED FEATURES

2015

Parallelization and optimization of a CBVIR system on multi-core architectures

2009 IEEE International Symposium on Parallel & Distributed Processing

2009

Modeling Objects with Local Descriptors of Biologically Motivated Selective Attention

2008 Fourth International Conference on Natural Computation

2008

Learning structured concept-segments for interactive video retrieval

CIVR '08

2008

Ensemble approach based on conditional random field for multi-label image and video annotation

ACM Multimedia

2011

Markov chain local binary pattern and its application to video concept detection

2008 15th IEEE International Conference on Image Processing

2008
