Image Search with Text Feedback by Deep Hierarchical Attention Mutual Information Maximization
Wei Wang | Jiajun Bu | Chunbin Gu | Zhen Zhang | Zhi Yu | Dongfang Ma
[1] Bo Zhao,et al. Memory-Augmented Attribute Manipulation Networks for Interactive Fashion Search , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[2] James Hays,et al. The sketchy database , 2016, ACM Trans. Graph.
[3] Suzanna Becker,et al. Mutual information maximization: models of cortical self-organization , 1996, Network.
[4] Qirong Mao,et al. Joint Attribute Manipulation and Modality Alignment Learning for Composing Text and Image to Image Retrieval , 2020, ACM Multimedia.
[5] S. Varadhan,et al. Asymptotic evaluation of certain Markov process expectations for large time , 1975.
[6] Rohan Ramanath,et al. An Attentive Survey of Attention Models , 2019, ACM Trans. Intell. Syst. Technol.
[7] Bart Thomee,et al. Interactive search in image retrieval: a survey , 2012, International Journal of Multimedia Information Retrieval.
[8] Shaohua Kevin Zhou,et al. Deep Networks and Mutual Information Maximization for Cross-Modal Medical Image Synthesis , 2017, Deep Learning for Medical Image Analysis.
[9] Jie Chen,et al. Attention on Attention for Image Captioning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[10] Alexander C. Berg,et al. Automatic Attribute Discovery and Characterization from Noisy Web Data , 2010, ECCV.
[11] James Philbin,et al. FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Hui Wang,et al. Bootstrap dual complementary hashing with semi-supervised re-ranking for image retrieval , 2020, Neurocomputing.
[13] Kristen Grauman,et al. Relative attributes , 2011, 2011 International Conference on Computer Vision.
[14] Kristen Grauman,et al. Attributes as Operators , 2018, ECCV.
[15] R Devon Hjelm,et al. Learning Representations by Maximizing Mutual Information Across Views , 2019, NeurIPS.
[16] Terrence J. Sejnowski,et al. Slow Feature Analysis: Unsupervised Learning of Invariances , 2002, Neural Computation.
[17] Xiaogang Wang,et al. DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[19] Aaron C. Courville,et al. FiLM: Visual Reasoning with a General Conditioning Layer , 2017, AAAI.
[20] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[21] Peng Gao,et al. Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Yann LeCun,et al. Learning a similarity metric discriminatively, with application to face verification , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[23] Yoshua Bengio,et al. Learning deep representations by mutual information estimation and maximization , 2018, ICLR.
[24] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Rogério Schmidt Feris,et al. Dialog-based Interactive Image Retrieval , 2018, NeurIPS.
[26] Thomas S. Huang,et al. Relevance feedback in image retrieval: A comprehensive review , 2003, Multimedia Systems.
[27] Abhinav Gupta,et al. Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[28] Björn Ommer,et al. Cross and Learn: Cross-Modal Self-Supervision , 2018, GCPR.
[29] Ashish Vaswani,et al. Self-Attention with Relative Position Representations , 2018, NAACL.
[30] Xiaogang Wang,et al. Deep Learning Face Representation from Predicting 10,000 Classes , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[31] Quoc V. Le,et al. Grounded Compositional Semantics for Finding and Describing Images with Sentences , 2014, TACL.
[32] Naftali Tishby,et al. Opening the Black Box of Deep Neural Networks via Information , 2017, ArXiv.
[33] Steven C. H. Hoi,et al. Learning Cross-Modal Embeddings With Adversarial Networks for Cooking Recipes and Food Images , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Shaogang Gong,et al. Image Search With Text Feedback by Visiolinguistic Attention Learning , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Thomas S. Huang,et al. Relevance feedback: a power tool for interactive content-based image retrieval , 1998, IEEE Trans. Circuits Syst. Video Technol.
[36] Sebastian Nowozin,et al. f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization , 2016, NIPS.
[37] Yupeng Gao,et al. Fashion IQ: A New Dataset towards Retrieving Images by Natural Language Feedback , 2019.
[38] Xiangwei Kong,et al. Learning Disentangled Representation for Cross-Modal Retrieval with Deep Mutual Information Estimation , 2019, ACM Multimedia.
[39] Dustin Tran,et al. Image Transformer , 2018, ICML.
[40] Jiajun Bu,et al. Cross-modal Image Retrieval with Deep Mutual Information Maximization , 2021, Neurocomputing.
[41] Takayuki Okatani,et al. Improved Fusion of Visual and Language Representations by Dense Symmetric Co-attention for Visual Question Answering , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[42] Bohyung Han,et al. Image Question Answering Using Convolutional Neural Network with Dynamic Parameter Prediction , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[43] Aaron C. Courville,et al. MINE: Mutual Information Neural Estimation , 2018, ArXiv.
[44] Yang Yang,et al. Adversarial Cross-Modal Retrieval , 2017, ACM Multimedia.
[45] Adriana Kovashka,et al. Attribute Pivots for Guiding Relevance Feedback in Image Search , 2013, 2013 IEEE International Conference on Computer Vision.
[46] Loris Bazzani,et al. Learning Joint Visual Semantic Matching Embeddings for Language-Guided Retrieval , 2020, ECCV.
[47] Yin Li,et al. Learning Deep Structure-Preserving Image-Text Embeddings , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Stan Sclaroff,et al. Deep Metric Learning to Rank , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[49] Adriana Kovashka,et al. WhittleSearch: Image search with relative attribute feedback , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[50] Razvan Pascanu,et al. A simple neural network module for relational reasoning , 2017, NIPS.
[51] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[52] Albert Gordo,et al. Deep Image Retrieval: Learning Global Representations for Image Search , 2016, ECCV.
[53] Tao Xiang,et al. Generalising Fine-Grained Sketch-Based Image Retrieval , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[54] Aapo Hyvärinen,et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models , 2010, AISTATS.
[55] Yingli Tian,et al. Self-Supervised Visual Feature Learning With Deep Neural Networks: A Survey , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[56] Byoung-Tak Zhang,et al. Multimodal Residual Learning for Visual QA , 2016, NIPS.
[57] Ashish Vaswani,et al. Stand-Alone Self-Attention in Vision Models , 2019, NeurIPS.
[58] Aapo Hyvärinen,et al. Noise-Contrastive Estimation of Unnormalized Statistical Models, with Applications to Natural Image Statistics , 2012, J. Mach. Learn. Res.
[59] Li Fei-Fei,et al. Composing Text and Image for Image Retrieval - an Empirical Odyssey , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[60] Kristen Grauman,et al. Thinking Outside the Pool: Active Training Image Creation for Relative Attributes , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[61] Jo Yew Tham,et al. Learning Attribute Representations with Localization for Flexible Fashion Search , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[62] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[63] Yan Yan,et al. Multi-Level Visual-Semantic Alignments with Relation-Wise Dual Attention Network for Image and Text Matching , 2019, IJCAI.
[64] J. Kinney,et al. Equitability, mutual information, and the maximal information coefficient , 2013, Proceedings of the National Academy of Sciences.
[65] Bo Chen,et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.
[66] Phillip Isola,et al. Contrastive Multiview Coding , 2019, ECCV.
[67] Helen Suzanna Becker,et al. An information-theoretic unsupervised learning algorithm for neural networks , 1993.
[68] David J. Fleet,et al. VSE++: Improving Visual-Semantic Embeddings with Hard Negatives , 2017, BMVC.
[69] Larry S. Davis,et al. Automatic Spatially-Aware Fashion Concept Discovery , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[70] Jiasen Lu,et al. Hierarchical Question-Image Co-Attention for Visual Question Answering , 2016, NIPS.
[71] Zhihai He,et al. Hybrid representation learning for cross-modal retrieval , 2019, Neurocomputing.
[72] Enhua Wu,et al. Squeeze-and-Excitation Networks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.