Video associated cross-modal recommendation algorithm based on deep learning

Abstract Traditional video recommendation mostly returns results in a single modality. However, video contains multi-modal content, and users may also be interested in content of other modalities related to a video. To meet the requirement of “recommending other media according to users’ preference for video data”, this paper proposes a new cross-modal recommendation method based on multi-modal deep learning. The method establishes a common representation across multi-modal features and learns the relationships among the modality features at the level of high-level semantic features. Experimental results on a standard data set verify that the algorithm can effectively produce multi-modal recommendation results associated with video features.
