Collaborative video reindexing via matrix factorization

Concept-based video indexing generates a matrix of scores predicting the possibilities of concepts occurring in video shots. Based on the idea of collaborative filtering, this article presents unsupervised methods to refine the initial scores generated by concept classifiers by taking into account the concept-to-concept correlation and shot-to-shot similarity embedded within the score matrix. Given a noisy matrix, we refine the inaccurate scores via matrix factorization. This method is further improved by learning multiple local models and incorporating contextual-temporal structures. Experiments on the TRECVID 2006--2008 datasets demonstrate relative performance gains ranging from 13% to 52% without using any user annotations or external knowledge resources.

[1]  Chong-Wah Ngo,et al.  Representations of Keypoint-Based Semantic Concept Detection: A Comprehensive Study , 2010, IEEE Transactions on Multimedia.

[2]  WengMing-Fang,et al.  Collaborative video reindexing via matrix factorization , 2012 .

[3]  Meng Wang,et al.  Correlative multilabel video annotation with temporal kernels , 2008, TOMCCAP.

[4]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[5]  Emine Yilmaz,et al.  Estimating average precision with incomplete and imperfect judgments , 2006, CIKM '06.

[6]  Chong-Wah Ngo,et al.  Towards optimal bag-of-features for object categorization and semantic video retrieval , 2007, CIVR '07.

[7]  Milind R. Naphade,et al.  A probabilistic framework for semantic video indexing, filtering, and retrieval , 2001, IEEE Trans. Multim..

[8]  Yi Ma,et al.  Robust principal component analysis? , 2009, JACM.

[9]  Tao Mei,et al.  CrowdReranking: exploring multiple search engines for visual search reranking , 2009, SIGIR.

[10]  Shih-Fu Chang,et al.  CU-VIREO 374 : Fusing Columbia 374 and VIREO 374 for Large Scale Semantic Concept Detection , 2008 .

[11]  Zuowei Shen,et al.  Robust video denoising using low rank matrix completion , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[12]  Dennis Koelma,et al.  The MediaMill TRECVID 2008 Semantic Video Search Engine , 2008, TRECVID.

[13]  Ming-Syan Chen,et al.  Association and Temporal Rule Mining for Post-Filtering of Semantic Concept Detection in Video , 2008, IEEE Transactions on Multimedia.

[14]  Tao Mei,et al.  Correlative multi-label video annotation , 2007, ACM Multimedia.

[15]  Thomas S. Huang,et al.  Factor graph framework for semantic video indexing , 2002, IEEE Trans. Circuits Syst. Video Technol..

[16]  John R. Smith,et al.  IBM Research TRECVID-2009 Video Retrieval System , 2009, TRECVID.

[17]  John R. Smith,et al.  Semantic Indexing of Multimedia Content Using Visual, Audio, and Text Cues , 2003, EURASIP J. Adv. Signal Process..

[18]  John R. Smith,et al.  Large-scale concept ontology for multimedia , 2006, IEEE MultiMedia.

[19]  Alexander G. Hauptmann,et al.  LSCOM Lexicon Definitions and Annotations (Version 1.0) , 2006 .

[20]  Yung-Yu Chuang,et al.  Multi-cue fusion for semantic video indexing , 2008, ACM Multimedia.

[21]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[22]  Nicu Sebe,et al.  Content-based multimedia information retrieval: State of the art and challenges , 2006, TOMCCAP.

[23]  Nathan Srebro,et al.  Fast maximum margin matrix factorization for collaborative prediction , 2005, ICML.

[24]  Paul Over,et al.  Evaluation campaigns and TRECVid , 2006, MIR '06.

[25]  Yehuda Koren,et al.  Matrix Factorization Techniques for Recommender Systems , 2009, Computer.

[26]  Chong-Wah Ngo,et al.  Domain adaptive semantic diffusion for large scale context-based video annotation , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[27]  Jiebo Luo,et al.  Utilizing semantic word similarity measures for video retrieval , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Newton Lee,et al.  ACM Transactions on Multimedia Computing, Communications and Applications (ACM TOMCCAP) , 2007, CIE.

[29]  Marcel Worring,et al.  The challenge problem for automated detection of 101 semantic concepts in multimedia , 2006, MM '06.

[30]  Taghi M. Khoshgoftaar,et al.  A Survey of Collaborative Filtering Techniques , 2009, Adv. Artif. Intell..

[31]  Marcel Worring,et al.  Concept-Based Video Retrieval , 2009, Found. Trends Inf. Retr..

[32]  Shih-Fu Chang,et al.  Context-Based Concept Fusion with Boosted Conditional Random Fields , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[33]  Yi-Hsuan Yang,et al.  Online Reranking via Ordinal Informative Concepts for Context Fusion in Concept Detection and Video Search , 2009, IEEE Transactions on Circuits and Systems for Video Technology.

[34]  Shih-Fu Chang,et al.  A reranking approach for context-based concept fusion in video indexing and retrieval , 2007, CIVR '07.

[35]  Shih-Fu Chang,et al.  Columbia University’s Baseline Detectors for 374 LSCOM Semantic Visual Concepts , 2007 .