暂无分享,去创建一个
Sandro Pezzelle | Raquel Fernández | Lisa Beinborn | Ece Takmaz | Sandro Pezzelle | R. Fernández | Lisa Beinborn | Ece Takmaz
[1] L. Gleitman,et al. On the give and take between event apprehension and utterance formulation. , 2007, Journal of memory and language.
[2] Kilian Q. Weinberger,et al. BERTScore: Evaluating Text Generation with BERT , 2019, ICLR.
[3] Alon Lavie,et al. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments , 2005, IEEvaluation@ACL.
[4] Nazli Ikizler-Cinbis,et al. Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures , 2016, J. Artif. Intell. Res..
[5] Moreno I. Coco,et al. Scan pattern in visual scenes predict Sentence production , 2010 .
[6] Nazli Ikizler-Cinbis,et al. Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures (Extended Abstract) , 2017, IJCAI.
[7] Jeff B. Pelz,et al. SNAG: Spoken Narratives and Gaze Dataset , 2018, ACL.
[8] Yusuke Sugano,et al. Seeing with Humans: Gaze-Assisted Neural Image Captioning , 2016, ArXiv.
[9] Ali Borji,et al. Human Attention in Image Captioning: Dataset and Analysis , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[10] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[11] Basura Fernando,et al. SPICE: Semantic Propositional Image Caption Evaluation , 2016, ECCV.
[12] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[13] Z. Griffin. Why Look? Reasons for Eye Movements Related to Language Production. , 2004 .
[14] Bernt Schiele,et al. Gaze Embeddings for Zero-Shot Image Classification , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] C. Lawrence Zitnick,et al. CIDEr: Consensus-based image description evaluation , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Matthew W. Crocker,et al. The influence of speaker gaze on listener comprehension: Contrasting visual versus intentional accounts , 2014, Cognition.
[17] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[18] Aïda Valls,et al. A Similarity Measure for Sequences of Categorical Data Based on the Ordering of Common Elements , 2008, MDAI.
[19] E. Miller,et al. Top-Down Versus Bottom-Up Control of Attention in the Prefrontal and Posterior Parietal Cortices , 2007, Science.
[20] Moreno I. Coco,et al. Scan Patterns Predict Sentence Production in the Cross-Modal Processing of Visual Scenes , 2012, Cogn. Sci..
[21] Moreno I. Coco,et al. Integrating mechanisms of visual guidance in naturalistic language production , 2015, Cognitive Processing.
[22] KellerFrank,et al. Automatic description generation from images , 2016 .
[23] Antonio Torralba,et al. Building the gist of a scene: the role of global image features in recognition. , 2006, Progress in brain research.
[24] Lei Zhang,et al. Bottom-Up and Top-Down Attention for Image Captioning and VQA , 2017, ArXiv.
[25] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[26] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Walter Daelemans,et al. Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource , 2016, LREC.
[28] Chen Chen,et al. Improving Image Captioning with Conditional Generative Adversarial Nets , 2018, AAAI.
[29] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Mert Kilickaya,et al. Re-evaluating Automatic Metrics for Image Captioning , 2016, EACL.
[31] Luc Van Gool,et al. Object Referring in Videos with Language and Human Gaze , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[32] Joachim Bingel,et al. Sequence Classification with Human Attention , 2018, CoNLL.
[33] Chin-Yew Lin,et al. ROUGE: A Package for Automatic Evaluation of Summaries , 2004, ACL 2004.
[34] Jongwook Choi,et al. Supervising Neural Attention Models for Video Captioning by Human Gaze Data , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Joyce Yue Chai,et al. Incorporating Temporal and Semantic Information with Eye Gaze for Automatic Word Acquisition in Multimodal Conversational Systems , 2008, EMNLP.
[36] Zenzi M. Griffin,et al. PSYCHOLOGICAL SCIENCE Research Article WHAT THE EYES SAY ABOUT SPEAKING , 2022 .
[37] A. L. Yarbus. Eye Movements During Perception of Complex Objects , 1967 .
[38] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[39] Jeff B. Pelz,et al. Alignment of Eye Movements and Spoken Language for Semantic Image Understanding , 2015, IWCS.
[40] Yang Feng,et al. Unsupervised Image Captioning , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[41] Sigrid Klerke,et al. At a Glance: The Impact of Gaze Aggregation Views on Syntactic Tagging , 2019, EMNLP.
[42] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[43] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[44] Takenobu Tokunaga,et al. Incrementally Tracking Reference in Human/Human Dialogue Using Linguistic and Extra-Linguistic Information , 2015, NAACL.
[45] Frank Keller,et al. Training Object Class Detectors from Eye Tracking Data , 2014, ECCV.
[46] Joachim Bingel,et al. Weakly Supervised Part-of-speech Tagging Using Eye-tracking Data , 2016, ACL.
[47] Fei Liu,et al. MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance , 2019, EMNLP.
[48] Qi Zhao,et al. Boosted Attention: Leveraging Human Attention for Image Captioning , 2018, ECCV.
[49] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[50] Emiel Krahmer,et al. DIDEC: The Dutch Image Description and Eye-tracking Corpus , 2018, COLING.
[51] G. T. Buswell. How People Look At Pictures: A Study Of The Psychology Of Perception In Art , 2012 .
[52] Ted Pedersen,et al. WordNet::Similarity - Measuring the Relatedness of Concepts , 2004, NAACL.
[53] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.