Web video topic discovery and tracking via bipartite graph reinforcement model

Automatic topic discovery and tracking on web-shared videos can greatly benefit both web service providers and end users. Most current solutions for topic detection and tracking were designed for news and cannot be directly applied to web videos, because web videos carry much less semantic information than news videos. In this paper, we propose a bipartite graph model to address this issue. The bipartite graph represents the correlation between web videos and their keywords, and automatic topic discovery is achieved in two steps: coarse topic filtering and fine topic re-ranking. First, a weight-updating co-clustering algorithm is employed to filter out topic candidates at a coarse level. Then the videos on each topic are re-ranked by analyzing the link structure of the corresponding bipartite graph. After the topics are discovered, the interesting ones can also be tracked over a period of time using the same bipartite graph model. The key is to propagate the relevance scores and keywords from the videos of interest to other relevant ones through the bipartite graph links. Experimental results on real web videos from YouKu, a YouTube counterpart in China, demonstrate the effectiveness of the proposed methods.
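The idea of propagating relevance from videos of interest to other videos through shared keywords can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a simple random walk with restart over an unweighted video-keyword bipartite graph, and the function name `propagate_scores`, the `damping` and `iterations` parameters, and the toy data are all hypothetical.

```python
from collections import defaultdict


def propagate_scores(video_keywords, seed_videos, iterations=20, damping=0.85):
    """Spread relevance from seed videos to other videos via shared keywords.

    Illustrative random walk with restart on a video-keyword bipartite graph;
    an assumption for this sketch, not the method described in the paper.
    """
    # Build the reverse adjacency: keyword -> set of videos tagged with it.
    keyword_videos = defaultdict(set)
    for video, keywords in video_keywords.items():
        for kw in keywords:
            keyword_videos[kw].add(video)

    # Restart distribution: uniform over the seed videos, zero elsewhere.
    seed = {
        v: (1.0 / len(seed_videos) if v in seed_videos else 0.0)
        for v in video_keywords
    }
    video_score = dict(seed)

    for _ in range(iterations):
        # Step 1 (videos -> keywords): each video splits its score evenly
        # among its keywords.
        keyword_score = defaultdict(float)
        for video, keywords in video_keywords.items():
            if keywords:
                share = video_score[video] / len(keywords)
                for kw in keywords:
                    keyword_score[kw] += share

        # Step 2 (keywords -> videos): each keyword splits its score evenly
        # among the videos that carry it.
        new_score = {v: 0.0 for v in video_keywords}
        for kw, videos in keyword_videos.items():
            share = keyword_score[kw] / len(videos)
            for v in videos:
                new_score[v] += share

        # Mix the propagated scores with the restart distribution.
        video_score = {
            v: (1 - damping) * seed[v] + damping * new_score[v]
            for v in video_keywords
        }
    return video_score


# Toy data (hypothetical): v1 is the video of interest.
videos = {
    "v1": {"earthquake", "sichuan"},
    "v2": {"earthquake", "rescue"},
    "v3": {"olympics", "beijing"},
    "v4": {"rescue", "sichuan"},
}
scores = propagate_scores(videos, seed_videos={"v1"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph, `v2` and `v4` each share a keyword with the seed `v1` and therefore receive a positive score, while `v3` shares no keyword path with the seed and stays at zero. A real system would presumably weight the edges (for example by keyword importance), which this sketch omits.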
