University of Marburg at TRECVID 2005: Shot Boundary Detection and Camera Motion Estimation Results

In this paper, we summarize our results in the shot boundary detection task and the low-level feature task at TRECVID 2005. The low-level feature task was to retrieve the shots in which one of the following camera motion events was present: pan, tilt, or zoom. For shot boundary detection, we present an unsupervised approach that aims to minimize the impact of parameter settings (4). Frame dissimilarities are measured for several frame distances by motion-compensated pixel differences of subsequent DC-frames and by histogram intersection of DC-frames. A feature vector consists of the dissimilarity value and its ratio to the maximum neighboring value within a sliding window. K-means clustering is used for both cut detection and gradual transition detection. For cut detection, the best sliding window size is estimated by evaluating the clustering quality of the "cuts" cluster for several window sizes. Furthermore, we investigate whether an ensemble of classifiers improves cut detection performance. For this purpose, the unsupervised approach is complemented by two supervised classifiers: an AdaBoost-based classifier and a Support Vector Machine (SVM). Both classifiers were trained on the TRECVID 2004 shot boundary test set. To retrieve shots with camera motion events, we have modified a previously presented approach to camera motion estimation (5). MPEG motion vectors are used to estimate the rotation and zoom parameters of a 3D camera model. However, because motion vectors are chosen to be optimal with respect to compression, many of them do not model the "real" motion adequately and can thus be considered "outliers". Furthermore, we exclude motion vectors at the frame border and motion vectors in the middle of the frame, since (moving) objects of interest are often captured in the middle area. Finally, the motion parameters must exceed a threshold for several consecutive frames to be considered camera motion.
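The feature construction and clustering step for cut detection can be illustrated with a minimal sketch. It assumes that a list of per-frame dissimilarity values is already available; the motion-compensated DC-frame differences and histogram intersections used in the paper, the multiple frame distances, and the window size estimation are not reproduced here. The function names (`ratio_features`, `two_means`, `detect_cuts`), the centroid initialization, and the absence of feature scaling are assumptions made for illustration only, not the implementation used in the submitted runs.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def ratio_features(dissim: List[float], half_window: int) -> List[Point]:
    """Build (value, value / max neighbor) pairs over a sliding window.

    half_window is a hypothetical parameter: the number of neighbors
    considered on each side of the current frame.
    """
    eps = 1e-9  # guards against division by zero in flat regions
    feats = []
    n = len(dissim)
    for i, value in enumerate(dissim):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        neighbors = dissim[lo:i] + dissim[i + 1:hi]
        max_neighbor = max(neighbors) if neighbors else 0.0
        feats.append((value, value / max(max_neighbor, eps)))
    return feats


def _sqdist(a: Point, b: Point) -> float:
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def two_means(points: List[Point], iterations: int = 20) -> List[int]:
    """Plain k-means with k=2; label 1 marks the assumed "cuts" cluster.

    Assumption: centroids start at the points with the smallest and the
    largest dissimilarity value, so cluster 1 is the high-dissimilarity one.
    """
    centroids = [min(points, key=lambda p: p[0]),
                 max(points, key=lambda p: p[0])]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            0 if _sqdist(p, centroids[0]) <= _sqdist(p, centroids[1]) else 1
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels


def detect_cuts(dissim: List[float], half_window: int) -> List[int]:
    """Return the frame indices assigned to the "cuts" cluster."""
    labels = two_means(ratio_features(dissim, half_window))
    return [i for i, lab in enumerate(labels) if lab == 1]
```

For a toy sequence with two isolated peaks, such as `[1.0, 1.2, 0.9, 1.1, 10.0, 1.0, 1.1, 0.9, 1.0, 12.0, 1.1, 1.0]` with `half_window=2`, `detect_cuts` returns the two peak positions. The ratio feature is what makes the peaks separable without a fixed threshold: a true cut is large relative to its neighborhood, whereas high motion raises the neighboring values as well.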

[1] Paul A. Viola, et al. Robust Real-Time Face Detection, 2001, International Journal of Computer Vision.

[2] Mike Brookes, et al. Precise real-time outlier removal from motion vector fields for 3D reconstruction, 2003, Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429).

[3] Edward J. Delp, et al. A fast algorithm for video parsing using MPEG compressed sequences, 1995, Proceedings., International Conference on Image Processing.

[4] Ba Tu Truong, et al. New enhancements to cut, fade, and dissolve detection processes in video segmentation, 2000, ACM Multimedia.

[5] Bernd Freisleben, et al. Frame difference normalization: an approach to reduce error rates of cut detection algorithms for MPEG videos, 2003, Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429).

[6] Boon-Lock Yeo, et al. Rapid scene analysis on compressed video, 1995, IEEE Trans. Circuits Syst. Video Technol.

[7] Robert P. W. Duin, et al. Limits on the majority vote accuracy in classifier fusion, 2003, Pattern Analysis & Applications.

[8] Christian Petersohn. Fraunhofer HHI at TRECVID 2004: Shot Boundary Detection System, 2004, TRECVID.

[9] Svetha Venkatesh, et al. Qualitative estimation of camera motion parameters from video sequences, 1997, Pattern Recognition.

[10] Bernd Freisleben, et al. Estimation of arbitrary camera motion in MPEG videos, 2004, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.

[11] Bernd Freisleben, et al. Improving cut detection in MPEG videos by GOP-oriented frame difference normalization, 2004, ICPR 2004.

[12] Bernd Freisleben, et al. Video Cut Detection without Thresholds, 2004.