Salient Object Detection on Large-Scale Video Data

A growing body of research focuses on extracting concepts from unstructured video data. To bridge the semantic gap between low-level features and high-level video concepts, salient objects are detected as a mid-level representation of video content, using image segmentation and machine learning techniques. Specifically, 21 salient object detectors are developed and tested on the TRECVID 2005 development video corpus. In addition, a boosting method is proposed to select the most representative features, achieving higher performance than using a single modality and lower complexity than using all features.
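
The abstract does not say which boosting variant is used or how features are scored, so the following is only a minimal sketch of the general idea, assuming AdaBoost-style rounds with one-feature threshold classifiers (decision stumps). In each round the feature whose stump has the lowest weighted error is chosen, so the features picked across rounds form the selected subset. The function names `best_stump` and `boost_select` and the toy data are hypothetical and do not come from the paper.

```python
import math

def best_stump(X, y, w, feature):
    """Return (weighted_error, threshold, polarity) of the best stump on one feature."""
    values = sorted(set(row[feature] for row in X))
    # Candidate thresholds: midpoints between consecutive distinct values.
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or [values[0]]
    best = (float("inf"), None, 1)
    for t in thresholds:
        for polarity in (1, -1):
            # Sum the weights of the samples this stump misclassifies.
            err = sum(
                wi for row, yi, wi in zip(X, y, w)
                if (polarity if row[feature] > t else -polarity) != yi
            )
            if err < best[0]:
                best = (err, t, polarity)
    return best

def boost_select(X, y, rounds=3):
    """Run AdaBoost-style rounds; return a list of (feature, threshold, polarity, alpha)."""
    n = len(X)
    w = [1.0 / n] * n  # start from uniform sample weights
    chosen = []
    for _ in range(rounds):
        # Pick the feature whose best stump has the lowest weighted error.
        candidates = [(best_stump(X, y, w, f), f) for f in range(len(X[0]))]
        (err, t, pol), f = min(candidates, key=lambda c: c[0][0])
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        chosen.append((f, t, pol, alpha))
        # Increase the weight of misclassified samples, then renormalize.
        w = [
            wi * math.exp(-alpha * yi * (pol if row[f] > t else -pol))
            for row, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return chosen

# Toy data (hypothetical): feature 1 separates the labels, feature 0 does not.
X = [[0.3, 0.1], [0.9, 0.2], [0.1, 0.8], [0.7, 0.9]]
y = [-1, -1, 1, 1]
selected = sorted({f for f, _, _, _ in boost_select(X, y)})
print(selected)
```

With multimodal features, the columns of `X` would hold descriptors from different modalities, and the set of columns chosen across rounds would be the reduced feature set passed to a detector.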
