Semi-supervised vision-language mapping via variational learning
Ling Shao | Li Zhang | Yuming Shen
[1] Margaret Mitchell, et al. VQA: Visual Question Answering, 2015, International Journal of Computer Vision.
[2] Jeffrey Dean, et al. Distributed Representations of Words and Phrases and their Compositionality, 2013, NIPS.
[3] Samy Bengio, et al. A Discriminative Kernel-Based Approach to Rank Images from Text Queries, 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[4] Ling Shao, et al. Sequential Discrete Hashing for Scalable Cross-Modality Similarity Retrieval, 2017, IEEE Transactions on Image Processing.
[5] Joshua B. Tenenbaum, et al. Deep Convolutional Inverse Graphics Network, 2015, NIPS.
[6] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[7] Ling Shao, et al. Latent Structure Preserving Hashing, 2017, International Journal of Computer Vision.
[8] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[9] Rabab Kreidieh Ward, et al. Deep Sentence Embedding Using Long Short-Term Memory Networks: Analysis and Application to Information Retrieval, 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[10] Ling Shao, et al. Multiview Alignment Hashing for Efficient Image Search, 2015, IEEE Transactions on Image Processing.
[11] Max Welling, et al. Semi-supervised Learning with Deep Generative Models, 2014, NIPS.
[12] Ling Shao, et al. Projection Bank: From High-Dimensional Data to Medium-Length Binary Codes, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[13] Trevor Darrell, et al. Long-term recurrent convolutional networks for visual recognition and description, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Xing Xu, et al. Semi-supervised Coupled Dictionary Learning for Cross-modal Retrieval in Internet Images and Texts, 2015, ACM Multimedia.
[15] Mario Fritz, et al. Ask Your Neurons: A Neural-Based Approach to Answering Questions about Images, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[16] Ling Shao, et al. Sequential Compact Code Learning for Unsupervised Image Hashing, 2016, IEEE Transactions on Neural Networks and Learning Systems.
[17] Armand Joulin, et al. Deep Fragment Embeddings for Bidirectional Image Sentence Mapping, 2014, NIPS.
[18] Yoshua Bengio, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, 2015, ICML.
[19] Ling Shao, et al. Binary Set Embedding for Cross-Modal Retrieval, 2017, IEEE Transactions on Neural Networks and Learning Systems.
[20] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[21] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[22] Marc'Aurelio Ranzato, et al. DeViSE: A Deep Visual-Semantic Embedding Model, 2013, NIPS.
[23] Honglak Lee, et al. Improved Multimodal Deep Learning with Variation of Information, 2014, NIPS.
[24] Honglak Lee, et al. Learning Structured Output Representation using Deep Conditional Generative Models, 2015, NIPS.
[25] Peter Young, et al. Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics, 2013, J. Artif. Intell. Res.
[26] Li Fei-Fei, et al. DenseCap: Fully Convolutional Localization Networks for Dense Captioning, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Fei-Fei Li, et al. Deep visual-semantic alignments for generating image descriptions, 2015, CVPR.
[28] Yin Li, et al. Learning Deep Structure-Preserving Image-Text Embeddings, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Wei Xu, et al. Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN), 2014, ICLR.
[30] Ling Shao, et al. Hetero-Manifold Regularisation for Cross-Modal Hashing, 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[31] Yueting Zhuang, et al. Learning Multimodal Neural Network with Ranking Examples, 2014, ACM Multimedia.
[32] Nitish Srivastava, et al. Multimodal learning with deep Boltzmann machines, 2012, J. Mach. Learn. Res.
[33] Peter Young, et al. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions, 2014, TACL.
[34] Wei Xu, et al. Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question, 2015, NIPS.
[35] Yueting Zhuang, et al. Deep Compositional Cross-modal Learning to Rank via Local-Global Alignment, 2015, ACM Multimedia.
[36] Wojciech Zaremba, et al. Learning to Execute, 2014, ArXiv.