Extraction of query term-related visual phrases for news video retrieval using mutual information

This paper presents an approach to extracting query term-related visual phrases using mutual information for object-based news video retrieval. Although visual words are useful for object representation, unstable visual words generally appear in the frame sequence of a shot. The appearance frequency of each visual word is therefore measured in a sliding window over the query term-related stories, and the resulting appearance pattern is used to characterize the visual word. From these appearance patterns, the mutual information between two visual words is estimated over all of the extracted stories and used to construct a visual word relation graph. Visual phrases are then extracted by discovering the complete sub-graphs (cliques) of this graph and are used for news video retrieval. Experiments on the MATBN news video database show that a good precision rate for news video retrieval can be achieved.
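
The pipeline described above (appearance patterns, pairwise mutual information, relation graph, complete sub-graphs) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the appearance pattern is encoded, how mutual information is estimated, or how edges are selected, so the binary pattern, the `window`, `min_count` and `threshold` parameters, and all function names below are assumptions made for illustration. Each story is assumed to be a list of frames, and each frame a set of visual word identifiers.

```python
from itertools import combinations
from math import log2


def appearance_pattern(frames, word, window=3, min_count=2):
    """Binary pattern (assumed encoding): 1 for each sliding-window position
    in which `word` occurs in at least `min_count` of the `window` frames."""
    hits = [1 if word in frame else 0 for frame in frames]
    return [
        1 if sum(hits[i:i + window]) >= min_count else 0
        for i in range(len(hits) - window + 1)
    ]


def mutual_information(x, y):
    """Mutual information in bits between two equal-length binary sequences."""
    n = len(x)
    if n == 0:
        return 0.0
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = sum(1 for u, v in zip(x, y) if u == a and v == b) / n
            if p_ab > 0:
                mi += p_ab * log2(p_ab / ((x.count(a) / n) * (y.count(b) / n)))
    return mi


def relation_graph(stories, words, threshold, window=3, min_count=2):
    """Link two visual words when the mutual information of their patterns,
    concatenated over all stories, reaches `threshold` (assumed rule).
    Note: plain mutual information also rewards anti-correlated words;
    this sketch does not filter that case."""
    patterns = {
        w: [bit for frames in stories
            for bit in appearance_pattern(frames, w, window, min_count)]
        for w in words
    }
    graph = {w: set() for w in words}
    for u, v in combinations(words, 2):
        if mutual_information(patterns[u], patterns[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def maximal_cliques(graph):
    """Maximal complete sub-graphs with at least two nodes (Bron-Kerbosch)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            if len(r) >= 2:
                cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


# Toy story: words "a" and "b" always appear together, "c" does not.
story = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c"},
         {"c"}, set(), {"a", "b"}, {"a", "b"}]
graph = relation_graph([story], ["a", "b", "c"], threshold=0.5)
phrases = maximal_cliques(graph)  # candidate visual phrases
```

In this toy story, "a" and "b" have identical appearance patterns and end up linked, while "c" stays isolated, so the only candidate phrase is the pair of "a" and "b". Smoothing the raw per-frame occurrences with a window is one way to tolerate the unstable visual words mentioned above, since a word that drops out of a single frame can still count as present in the surrounding window.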
