Ernest Valveny | Suman Kumar Ghosh | Arka Ujjal Dey
[2] Marc'Aurelio Ranzato, et al. DeViSE: A Deep Visual-Semantic Embedding Model, 2013, NIPS.
[3] Mingda Zhang, et al. Automatic Understanding of Image and Video Advertisements, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Gang Hua, et al. Hierarchical Multimodal LSTM for Dense Visual-Semantic Embedding, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[5] Mingda Zhang, et al. Equal But Not The Same: Understanding the Implicit Relationship Between Persuasive Images and Text, 2018, BMVC.
[6] Shuang Bai, et al. A survey on automatic image caption generation, 2018, Neurocomputing.
[7] David J. Fleet, et al. VSE++: Improving Visual-Semantic Embeddings with Hard Negatives, 2017, BMVC.
[8] Anders Brun, et al. Semantic and Verbatim Word Spotting Using Deep Neural Networks, 2016, 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR).
[9] Naila Murray, et al. LEWIS: Latent Embeddings for Word Images and Their Semantics, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[10] Xiaolin Li, et al. Single Shot Text Detector with Regional Attention, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[11] Adriana Kovashka, et al. ADVISE: Symbolism and External Knowledge for Decoding Advertisements, 2017, ECCV.
[12] Jiebo Luo, et al. Integrating Scene Text and Visual Appearance for Fine-Grained Image Classification, 2017, IEEE Access.
[13] Theo Gevers, et al. Con-Text: Text Detection for Fine-Grained Object Classification, 2017, IEEE Transactions on Image Processing.
[14] Quoc V. Le, et al. Distributed Representations of Sentences and Documents, 2014, ICML.
[15] Anders Brun, et al. Semantic and Verbatim Word Spotting Using Deep Neural Networks, 2016, ICFHR 2016.
[16] Theo Gevers, et al. Con-text: text detection using background connectivity for fine-grained object classification, 2013, ACM Multimedia.
[17] Trevor Darrell, et al. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding, 2016, EMNLP.
[18] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[20] Jeffrey Dean, et al. Efficient Estimation of Word Representations in Vector Space, 2013, ICLR.
[21] Chiranjib Bhattacharyya, et al. Incorporating Syntactic and Semantic Information in Word Embeddings using Graph Convolutional Networks, 2018, ACL.
[22] Matthieu Cord, et al. MUTAN: Multimodal Tucker Fusion for Visual Question Answering, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[23] Khyathi Raghavi Chandu, et al. Textually Enriched Neural Module Networks for Visual Question Answering, 2018, ArXiv.
[24] Ajay Divakaran, et al. Understanding Visual Ads by Aligning Symbols and Objects using Co-Attention, 2018, ArXiv.
[25] Shuchang Zhou, et al. EAST: An Efficient and Accurate Scene Text Detector, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Jiasen Lu, et al. Hierarchical Question-Image Co-Attention for Visual Question Answering, 2016, NIPS.
[27] Sergio Guadarrama, et al. Speed/Accuracy Trade-Offs for Modern Convolutional Object Detectors, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Xinlei Chen, et al. Towards VQA Models That Can Read, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Pietro Liò, et al. Graph Attention Networks, 2017, ICLR.
[30] Alex Graves, et al. Recurrent Models of Visual Attention, 2014, NIPS.
[31] Ernest Valveny, et al. Scene Text Visual Question Answering, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[32] John Shawe-Taylor, et al. Canonical Correlation Analysis: An Overview with Application to Learning Methods, 2004, Neural Computation.
[33] Dan Klein, et al. Learning to Compose Neural Networks for Question Answering, 2016, NAACL.
[34] Wenyu Liu, et al. TextBoxes: A Fast Text Detector with a Single Deep Neural Network, 2016, AAAI.
[35] David J. Fleet, et al. VSE++: Improved Visual-Semantic Embeddings, 2017, ArXiv.
[36] Andrew Zisserman, et al. Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition, 2014, ArXiv.
[37] Jeff A. Bilmes, et al. Deep Canonical Correlation Analysis, 2013, ICML.
[38] Yu Cheng, et al. Relation-Aware Graph Attention Network for Visual Question Answering, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[39] Lluis Gomez, et al. Exploring Hate Speech Detection in Multimodal Publications, 2019, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).
[40] Xiang Bai, et al. Robust Scene Text Recognition with Automatic Rectification, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[41] Xin He, et al. Scene Text Detection and Recognition: The Deep Learning Era, 2018, International Journal of Computer Vision.
[42] Arnold W. M. Smeulders, et al. Words Matter: Scene Text for Image Classification and Retrieval, 2017, IEEE Transactions on Multimedia.
[43] Li Fei-Fei, et al. DenseCap: Fully Convolutional Localization Networks for Dense Captioning, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Ernest Valveny, et al. Don't only Feel Read: Using Scene text to understand advertisements, 2018, ArXiv.
[45] Andrew Zisserman, et al. Reading Text in the Wild with Convolutional Neural Networks, 2014, International Journal of Computer Vision.
[46] Junjie Yan, et al. FOTS: Fast Oriented Text Spotting with a Unified Network, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.