Context-aware transformer for image captioning
