Modality-Specific Cross-Modal Similarity Measurement With Recurrent Attention Network
暂无分享,去创建一个
Yuxin Peng | Jinwei Qi | Yuxin Yuan | Yuxin Peng | Jinwei Qi | Yuxin Yuan
[1] Yoon Kim,et al. Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.
[2] Xiaohua Zhai,et al. Heterogeneous Metric Learning with Joint Graph Regularization for Cross-Media Retrieval , 2013, AAAI.
[3] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[4] Nitish Srivastava,et al. Learning Representations for Multimodal Data with Deep Belief Nets , 2012 .
[5] Xin Huang,et al. An Overview of Cross-Media Retrieval: Concepts, Methodologies, Benchmarks, and Challenges , 2017, IEEE Transactions on Circuits and Systems for Video Technology.
[6] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[7] Xuelong Li,et al. Learning Discriminative Binary Codes for Large-scale Cross-modal Retrieval , 2017, IEEE Transactions on Image Processing.
[8] Alex Graves,et al. DRAW: A Recurrent Neural Network For Image Generation , 2015, ICML.
[9] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[10] Roger Levy,et al. A new approach to cross-modal multimedia retrieval , 2010, ACM Multimedia.
[11] Xiaohua Zhai,et al. Semi-Supervised Cross-Media Feature Learning With Unified Patch Graph Regularization , 2016, IEEE Transactions on Circuits and Systems for Video Technology.
[12] Changsheng Xu,et al. Learning Consistent Feature Representation for Cross-Modal Multimedia Retrieval , 2015, IEEE Transactions on Multimedia.
[13] Jiasen Lu,et al. Hierarchical Question-Image Co-Attention for Visual Question Answering , 2016, NIPS.
[14] H. Hotelling. Relations Between Two Sets of Variates , 1936 .
[15] Meng Wang,et al. Neighborhood Discriminant Hashing for Large-Scale Image Retrieval , 2015, IEEE Transactions on Image Processing.
[16] Nitish Srivastava,et al. Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.
[17] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Phil Blunsom,et al. Reasoning about Entailment with Neural Attention , 2015, ICLR.
[19] Jian Wang,et al. Cross-Modal Retrieval via Deep and Bidirectional Representation Learning , 2016, IEEE Transactions on Multimedia.
[20] Liang Wang,et al. Cross-Modal Subspace Learning via Pairwise Constraints , 2014, IEEE Transactions on Image Processing.
[21] Michael Isard,et al. A Multi-View Embedding Space for Modeling Internet Images, Tags, and Their Semantics , 2012, International Journal of Computer Vision.
[22] Tieniu Tan,et al. Joint Feature Selection and Subspace Learning for Cross-Modal Retrieval , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[23] Jeff A. Bilmes,et al. Deep Canonical Correlation Analysis , 2013, ICML.
[24] C. V. Jawahar,et al. Multi-label Cross-Modal Retrieval , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[25] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[26] Krystian Mikolajczyk,et al. Deep correlation for matching images and text , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Alexander J. Smola,et al. Stacked Attention Networks for Image Question Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Xinbo Gao,et al. Multimodal Discriminative Binary Embedding for Large-Scale Cross-Modal Retrieval , 2016, IEEE Transactions on Image Processing.
[29] Yueting Zhuang,et al. Learning of Multimodal Representations With Random Walks on the Click Graph , 2016, IEEE Transactions on Image Processing.
[30] Roger Levy,et al. On the Role of Correlation and Abstraction in Cross-Modal Multimedia Retrieval , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[31] Mikhail Belkin,et al. Regularization and Semi-supervised Learning on Large Graphs , 2004, COLT.
[32] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[33] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[34] Yueting Zhuang,et al. Cross-Modal Learning to Rank via Latent Joint Representation , 2015, IEEE Transactions on Image Processing.
[35] Yuxin Peng,et al. Cross-Media Shared Representation by Hierarchical Learning with Multiple Deep Networks , 2016, IJCAI.
[36] Jason Weston,et al. A Neural Attention Model for Abstractive Sentence Summarization , 2015, EMNLP.
[37] John Shawe-Taylor,et al. Canonical Correlation Analysis: An Overview with Application to Learning Methods , 2004, Neural Computation.
[38] Lior Wolf,et al. Associating neural word embeddings with deep image representations using Fisher Vectors , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[39] Juhan Nam,et al. Multimodal Deep Learning , 2011, ICML.
[40] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[41] Yao Zhao,et al. Cross-Modal Retrieval With CNN Visual Features: A New Baseline , 2017, IEEE Transactions on Cybernetics.
[42] Yueting Zhuang,et al. Task-driven Visual Saliency and Attention-based Visual Question Answering , 2017, ArXiv.
[43] Qi Tian,et al. Coarse2Fine: Two-Layer Fusion For Image Retrieval , 2016, ArXiv.
[44] Ruifan Li,et al. Cross-modal Retrieval with Correspondence Autoencoder , 2014, ACM Multimedia.
[45] Ishwar K. Sethi,et al. Multimedia content processing through cross-modal association , 2003, MULTIMEDIA '03.
[46] Yuxin Peng,et al. CM-GANs , 2019, ACM Trans. Multim. Comput. Commun. Appl..
[47] Phil Blunsom,et al. Teaching Machines to Read and Comprehend , 2015, NIPS.
[48] Alex Graves,et al. Recurrent Models of Visual Attention , 2014, NIPS.
[49] Xi Wang,et al. Multi-Stream Multi-Class Fusion of Deep Networks for Video Classification , 2016, ACM Multimedia.
[50] Cyrus Rashtchian,et al. Collecting Image Annotations Using Amazon’s Mechanical Turk , 2010, Mturk@HLT-NAACL.
[51] Xiaohua Zhai,et al. Learning Cross-Media Joint Representation With Sparse and Semisupervised Regularization , 2014, IEEE Transactions on Circuits and Systems for Video Technology.
[52] Yuxin Peng,et al. Clip-based similarity measure for query-dependent clip retrieval and video summarization , 2006, IEEE Trans. Circuits Syst. Video Technol..