Content-Based Video Relevance Prediction with Second-Order Relevance and Attention Modeling

This paper describes our method for the Content-Based Video Relevance Prediction (CBVRP) challenge. The method is based on deep learning: we train a deep network to predict the relevance between two video sequences from their features. We explore second-order relevance both in preparing training data and in extending the deep network. Second-order relevance refers to, for example, the relevance between x and z when x is relevant to y and y is relevant to z. When preparing training data, we use second-order relevance to increase the number of positive samples and decrease the number of negative samples. We further extend the deep network with an attention module whose attention mechanism is designed for second-order relevant video sequences. We verify the effectiveness of our method on the validation set of the CBVRP challenge.
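To make the data-preparation idea concrete, the following is a minimal sketch of how second-order pairs could be derived from first-order relevance pairs and used to enlarge the positive set while shrinking the negative set. It is an illustration under stated assumptions, not the exact procedure used in the paper: the function names `second_order_pairs` and `build_training_pairs` are hypothetical, relevance is treated as directed, and every non-positive ordered pair is treated as a negative candidate.

```python
from collections import defaultdict


def second_order_pairs(first_order):
    """Return pairs (x, z) such that x -> y and y -> z for some y.

    Pairs that are already first-order relevant are excluded.
    Assumption: relevance is directed, given as (source, target) tuples.
    """
    relevant_to = defaultdict(set)
    for x, y in first_order:
        relevant_to[x].add(y)

    first = set(first_order)
    second = set()
    for x, ys in relevant_to.items():
        for y in ys:
            # .get avoids inserting new keys while iterating the dict
            for z in relevant_to.get(y, ()):
                if z != x and (x, z) not in first:
                    second.add((x, z))
    return second


def build_training_pairs(videos, first_order):
    """Split ordered video pairs into positives and negatives.

    Assumption: second-order pairs are added to the positives, which
    removes them from the pool of negative candidates.
    """
    positives = set(first_order) | second_order_pairs(first_order)
    negatives = {
        (x, z)
        for x in videos
        for z in videos
        if x != z and (x, z) not in positives
    }
    return positives, negatives


if __name__ == "__main__":
    videos = ["a", "b", "c", "d"]
    first_order = [("a", "b"), ("b", "c")]
    positives, negatives = build_training_pairs(videos, first_order)
    print(sorted(positives))
    print(len(negatives))
```

In this toy example, the pair ("a", "c") is not labeled relevant directly, but it is reached through "b", so it moves from the negative candidates to the positives.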
