Affective Video Content Analyses by Using Cross-Modal Embedding Learning Features
[1] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[2] Aren Jansen,et al. Audio Set: An ontology and human-labeled dataset for audio events , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Emmanuel Dellandréa,et al. The MediaEval 2016 Emotional Impact of Movies Task , 2016, MediaEval.
[4] Qiang Ji,et al. A Multimodal Deep Regression Bayesian Network for Affective Video Content Analyses , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[5] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Ping Hu,et al. Learning supervised scoring ensemble for emotion recognition in the wild , 2017, ICMI.
[8] Marc'Aurelio Ranzato,et al. DeViSE: A Deep Visual-Semantic Embedding Model , 2013, NIPS.
[9] Apostol Natsev,et al. YouTube-8M: A Large-Scale Video Classification Benchmark , 2016, arXiv.
[10] Shiliang Zhang,et al. Affective Visualization and Retrieval for Music Video , 2010, IEEE Transactions on Multimedia.
[11] Ruslan Salakhutdinov,et al. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models , 2014, arXiv.
[12] Amaia Salvador,et al. Learning Cross-Modal Embeddings for Cooking Recipes and Food Images , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Anil A. Bharath,et al. Adversarial Training for Sketch Retrieval , 2016, ECCV Workshops.
[14] Antonio Torralba,et al. SoundNet: Learning Sound Representations from Unlabeled Video , 2016, NIPS.
[15] Yin Li,et al. Learning Deep Structure-Preserving Image-Text Embeddings , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Antonio Torralba,et al. See, Hear, and Read: Deep Aligned Representations , 2017, arXiv.
[17] Yaxin Wang,et al. Exploring Domain Knowledge for Affective Video Content Analyses , 2017, ACM Multimedia.
[18] Frank Hopfgartner,et al. Understanding Affective Content of Music Videos through Learned Representations , 2014, MMM.
[19] Emmanuel Dellandréa,et al. LIRIS-ACCEDE: A Video Database for Affective Content Analysis , 2015, IEEE Transactions on Affective Computing.
[20] Quoc V. Le,et al. Grounded Compositional Semantics for Finding and Describing Images with Sentences , 2014, TACL.
[21] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[22] Emmanuel Dellandréa,et al. The MediaEval 2015 Affective Impact of Movies Task , 2015, MediaEval.