Reinforced Cross-Media Correlation Learning by Context-Aware Bidirectional Translation
[1] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Xin Huang,et al. An Overview of Cross-Media Retrieval: Concepts, Methodologies, Benchmarks, and Challenges , 2017, IEEE Transactions on Circuits and Systems for Video Technology.
[3] C. V. Jawahar,et al. Multi-label Cross-Modal Retrieval , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[4] Qi Tian,et al. Pooling the Convolutional Layers in Deep ConvNets for Video Action Recognition , 2015, IEEE Transactions on Circuits and Systems for Video Technology.
[5] Yuxin Peng,et al. Cross-Media Shared Representation by Hierarchical Learning with Multiple Deep Networks , 2016, IJCAI.
[6] Yue Gao,et al. Attribute-Augmented Semantic Hierarchy: Towards a Unified Framework for Content-Based Image Retrieval , 2014, TOMM.
[7] Ruifan Li,et al. Cross-modal Retrieval with Correspondence Autoencoder , 2014, ACM Multimedia.
[8] John Shawe-Taylor,et al. Canonical Correlation Analysis: An Overview with Application to Learning Methods , 2004, Neural Computation.
[9] Changsheng Xu,et al. Learning Consistent Feature Representation for Cross-Modal Multimedia Retrieval , 2015, IEEE Transactions on Multimedia.
[10] Ishwar K. Sethi,et al. Multimedia content processing through cross-modal association , 2003, MULTIMEDIA '03.
[11] Yuxin Peng,et al. CCL: Cross-modal Correlation Learning With Multigrained Fusion by Hierarchical Network , 2017, IEEE Transactions on Multimedia.
[12] Qi Tian,et al. Sequential Video VLAD: Training the Aggregation Locally and Temporally , 2018, IEEE Transactions on Image Processing.
[13] Daniel Marcu,et al. Statistical Phrase-Based Translation , 2003, NAACL.
[14] H. Hotelling. Relations Between Two Sets of Variates , 1936.
[15] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[16] Ning Zhang,et al. Deep Reinforcement Learning-Based Image Captioning with Embedding Reward , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Yueting Zhuang,et al. Task-driven Visual Saliency and Attention-based Visual Question Answering , 2017, ArXiv.
[18] Svetlana Lazebnik,et al. Active Object Localization with Deep Reinforcement Learning , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[19] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[20] Yongdong Zhang,et al. Context-Aware Visual Policy Network for Sequence-Level Image Captioning , 2018, ACM Multimedia.
[21] Nitish Srivastava,et al. Multimodal learning with deep Boltzmann machines , 2012, J. Mach. Learn. Res.
[22] Yuxin Peng,et al. Modality-Specific Cross-Modal Similarity Measurement With Recurrent Attention Network , 2017, IEEE Transactions on Image Processing.
[23] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[24] H. McGurk,et al. Hearing lips and seeing voices , 1976, Nature.
[25] Yuxin Peng,et al. CM-GANs , 2019, ACM Trans. Multim. Comput. Commun. Appl.
[26] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[27] Juhan Nam,et al. Multimodal Deep Learning , 2011, ICML.
[28] Yao Zhao,et al. Cross-Modal Retrieval With CNN Visual Features: A New Baseline , 2017, IEEE Transactions on Cybernetics.
[29] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[30] Cyrus Rashtchian,et al. Collecting Image Annotations Using Amazon’s Mechanical Turk , 2010, Mturk@HLT-NAACL.
[31] Xiaohua Zhai,et al. Learning Cross-Media Joint Representation With Sparse and Semisupervised Regularization , 2014, IEEE Transactions on Circuits and Systems for Video Technology.
[32] Yuxin Peng,et al. Deep Cross-Media Knowledge Transfer , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[33] Xiaohua Zhai,et al. Heterogeneous Metric Learning with Joint Graph Regularization for Cross-Media Retrieval , 2013, AAAI.
[34] Nitish Srivastava,et al. Learning Representations for Multimodal Data with Deep Belief Nets , 2012.
[35] Chong-Wah Ngo,et al. Deep Multimodal Learning for Affective Analysis and Retrieval , 2015, IEEE Transactions on Multimedia.
[36] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[37] Krystian Mikolajczyk,et al. Deep correlation for matching images and text , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[38] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[39] Michael Isard,et al. A Multi-View Embedding Space for Modeling Internet Images, Tags, and Their Semantics , 2012, International Journal of Computer Vision.
[40] Tieniu Tan,et al. Joint Feature Selection and Subspace Learning for Cross-Modal Retrieval , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[41] Yoon Kim,et al. Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.
[42] Roger Levy,et al. A new approach to cross-modal multimedia retrieval , 2010, ACM Multimedia.
[43] Gang Wang,et al. Convolutional recurrent neural networks: Learning spatial dependencies for image representation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[44] Wei Wang,et al. Effective multi-modal retrieval based on stacked auto-encoders , 2014, VLDB 2014.
[45] Tie-Yan Liu,et al. Dual Learning for Machine Translation , 2016, NIPS.
[46] Yang Yang,et al. Adversarial Cross-Modal Retrieval , 2017, ACM Multimedia.
[47] Jeff A. Bilmes,et al. Deep Canonical Correlation Analysis , 2013, ICML.