Latent semantic analysis for an effective region-based video shot retrieval system

We present a complete and efficient framework for video shot indexing and retrieval. Video shots are described by their key-frames, which are in turn described by their regions. Region-based approaches suffer from the complexity of the segmentation and comparison tasks, so a compact region-based shot representation is usually obtained through vector quantization. We introduce latent semantic analysis (LSA) to reduce the noise inherent in the segmentation and quantization processes. To better capture the content of video shots, we then propose two original methods: the first takes advantage of a multi-scale segmentation of frames, while the second uses multiple frames to represent a shot. Both approaches require more computation time during pre-processing but not for the indexing and comparison tasks, since the extra information is included in the original signatures of the shots. Finally, we introduce a relevance feedback loop to optimize the search and propose a new method to optimize the effect of LSA. In the experimental section, we evaluate latent semantic analysis and the proposed approaches on two problems, namely object retrieval and semantic content estimation.
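To make the role of LSA concrete, the following is a minimal sketch of the general technique, not the authors' implementation. It assumes each shot signature is a vector of occurrence counts over a vocabulary of quantized region descriptors ("visual terms"), and that shots are compared by cosine similarity in the latent space; the abstract specifies neither the weighting nor the distance, so both are assumptions. The function names (`fit_lsa`, `project`, `rank_shots`) are hypothetical.

```python
import numpy as np

def fit_lsa(counts, k):
    """Return the top-k left singular vectors of the term-by-shot matrix.

    counts: array of shape (n_terms, n_shots); counts[i, j] is the number of
    regions of shot j assigned to visual term i (assumed representation).
    """
    U, _, _ = np.linalg.svd(counts, full_matrices=False)
    return U[:, :k]  # basis of the k-dimensional latent space

def project(U_k, signature):
    """Project one shot signature (length n_terms) into the latent space."""
    return U_k.T @ signature

def cosine(a, b):
    """Cosine similarity, defined as 0 when either vector is null."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def rank_shots(U_k, counts, query):
    """Rank database shots by decreasing latent-space similarity to a query."""
    q = project(U_k, query)
    shots = U_k.T @ counts  # shape (k, n_shots)
    scores = [cosine(q, shots[:, j]) for j in range(shots.shape[1])]
    return sorted(range(len(scores)), key=lambda j: -scores[j])
```

Keeping only the first k singular directions discards the low-energy components of the signatures, which is the sense in which LSA is expected to attenuate segmentation and quantization noise. The value of k is a free parameter in this sketch.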
