University of Central Florida at TRECVID 2004

This year, the Computer Vision Group at University of Central Florida participated in two tasks in TRECVID 2004: High-Level Feature Extraction and Story Segmentation. For feature extraction task, we have developed the detection methods for “Madeleine Albright”, “Bill Clinton”, “Beach”, “Basketball Scored” and “People Walking/Running”. We used the adaboost technique, and has employed the speech recognition output in addition to visual cues. In story segmentation, we used a 2-phase approach. The video is initially segmented into coarse segmentation by finding the anchor persons. The coarse segmentation is then refined in the second phase by further detecting weather and sports stories and merging the semantically related stories, which is determined by the visual and text similarities.

[1]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[2]  Boon-Lock Yeo,et al.  Extracting story units from long programs for video browsing and navigation , 1996, Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems.

[3]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1997, EuroCOLT.

[4]  Amarnath Gupta,et al.  Virage video engine , 1997, Electronic Imaging.

[5]  Boon-Lock Yeo,et al.  Segmentation of Video by Clustering and Graph Analysis , 1998, Comput. Vis. Image Underst..

[6]  Alan Hanjalic,et al.  Automated high-level movie segmentation for advanced video-retrieval systems , 1999, IEEE Trans. Circuits Syst. Video Technol..

[7]  Wolfgang Effelsberg,et al.  Scene Determination Based on Video and Audio Features , 1999, Proceedings IEEE International Conference on Multimedia Computing and Systems.

[8]  Chong-Wah Ngo,et al.  Motion-Based Video Representation for Scene Change Detection , 2000, Proceedings 15th International Conference on Pattern Recognition. ICPR-2000.

[9]  Svetha Venkatesh,et al.  Novel approach to determining tempo and dramatic story sections in motion pictures , 2000, Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101).

[10]  Paul A. Viola,et al.  Robust Real-time Object Detection , 2001 .

[11]  Jean-Luc Gauvain,et al.  The LIMSI Broadcast News transcription system , 2002, Speech Commun..

[12]  Mubarak Shah,et al.  Visual Content-Based Segmentation of Talk and Game Shows , 2002 .

[13]  Shrikanth Narayanan,et al.  Movie Content Analysis, Indexing and Skimming Via Multimodal Information , 2003 .

[14]  Mubarak Shah,et al.  Scene detection in Hollywood movies and TV shows , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[15]  Chong-Wah Ngo,et al.  Motion-Based Video Representation for Scene Change Detection , 2004, International Journal of Computer Vision.

[16]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, International Journal of Computer Vision.

[17]  Rainer Lienhart,et al.  Scene Determination Based on Video and Audio Features , 2004, Multimedia Tools and Applications.

[18]  Atreyi Kankanhalli,et al.  Automatic partitioning of full-motion video , 1993, Multimedia Systems.