A Novel Video Retrieval Method Based on Web Community Extraction Using Features of Video Materials

This paper proposes a novel video retrieval method based on Web community extraction that uses audio, visual, and textual features of video materials. Canonical correlation analysis (CCA) is applied to the three kinds of features computed from video materials and their Web pages, which makes it possible to transform each feature into a common variate space. The transformed variates reflect the relationships among the visual, audio, and textual features, so the similarity between video materials can be computed in the same space for each feature. The proposed method then introduces the obtained similarities into the link relationships between the corresponding Web pages. Furthermore, by performing link analysis on the resulting weighted link graph, the approach extracts Web communities containing similar topics and provides the degree of attribution of each video material to each Web community, for each feature. By comparing the degrees of attribution across the Web communities extracted from the three kinds of features, the desired communities are selected automatically. Consequently, by monitoring the degrees of attribution of the obtained Web communities, the proposed method can perform effective video retrieval. Experimental results obtained by applying the method to video materials collected from actual Web pages verify its effectiveness.
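The core CCA step described above, transforming features from different modalities into a shared variate space where they become directly comparable, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature matrices, dimensions, and regularization constant are assumptions, and CCA is computed via SVD of the whitened cross-covariance.

```python
import numpy as np

def cca(X, Y, reg=1e-6):
    """Canonical correlation analysis via SVD of the whitened
    cross-covariance. Returns projections Wx, Wy that map X and Y
    into a shared variate space, plus the canonical correlations."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Regularized covariance estimates (reg keeps them invertible)
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        # Inverse matrix square root via eigendecomposition
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    K = inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(K)
    Wx = inv_sqrt(Cxx) @ U
    Wy = inv_sqrt(Cyy) @ Vt.T
    return Wx, Wy, s

# Toy example: "visual" and "textual" features sharing a latent topic
rng = np.random.default_rng(0)
z = rng.standard_normal((200, 2))                        # shared latent factor
X = z @ rng.standard_normal((2, 5)) + 0.1 * rng.standard_normal((200, 5))
Y = z @ rng.standard_normal((2, 4)) + 0.1 * rng.standard_normal((200, 4))
Wx, Wy, corrs = cca(X, Y)
# Projections into the shared variate space are comparable across modalities,
# so cross-modal similarities can be computed there.
Ux = (X - X.mean(axis=0)) @ Wx
Uy = (Y - Y.mean(axis=0)) @ Wy
```

In the paper's setting the same idea is extended to three feature sets (visual, audio, textual); multiset generalizations of CCA (e.g., Horst's and Nielsen's formulations cited by the authors) handle more than two views.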
