BERTHA: Video Captioning Evaluation Via Transfer-Learned Human Assessment
[1] Jonathan G. Fiscus, et al. TRECVID 2018: Benchmarking Video Activity Detection, Video Captioning and Matching, Video Storytelling Linking and Video Search, 2018, TRECVID.
[2] Alon Lavie, et al. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments, 2005, IEEvaluation@ACL.
[3] C. Lawrence Zitnick, et al. CIDEr: Consensus-based image description evaluation, 2015, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Jinglu Hu, et al. Improving Image Captioning Evaluation by Considering Inter References Variance, 2020, ACL.
[5] Bernard Ghanem, et al. ActivityNet: A large-scale video benchmark for human activity understanding, 2015, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Paul Over, et al. Evaluation campaigns and TRECVid, 2006, MIR '06.
[7] Yansong Tang, et al. COIN: A Large-Scale Dataset for Comprehensive Instructional Video Analysis, 2019, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Markus Freitag, et al. Results of the WMT20 Metrics Shared Task, 2020, WMT.
[9] Serge J. Belongie, et al. Learning to Evaluate Image Captioning, 2018, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Georges Quénot, et al. TRECVID 2017: Evaluating Ad-hoc and Instance Video Search, Events Detection, Video Captioning and Hyperlinking, 2017, TRECVID.
[11] Jonathan G. Fiscus, et al. TRECVID 2019: An Evaluation Campaign to Benchmark Video Activity Detection, Video Captioning and Matching, and Video Search & Retrieval, 2019, TRECVID.
[12] Chin-Yew Lin, et al. ROUGE: A Package for Automatic Evaluation of Summaries, 2004, ACL.
[13] Basura Fernando, et al. SPICE: Semantic Propositional Image Caption Evaluation, 2016, ECCV.
[14] Jonathan G. Fiscus, et al. TRECVID 2020: A Comprehensive Campaign for Evaluating Video Retrieval Tasks across Multiple Application Domains, 2021, TRECVID.
[15] Sanja Fidler, et al. Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books, 2015, IEEE International Conference on Computer Vision (ICCV).
[16] George Awad, et al. Evaluation of automatic video captioning using direct assessment, 2017, PLoS ONE.
[17] Mamoru Komachi, et al. Machine Translation Evaluation with BERT Regressor, 2019, arXiv.
[18] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[19] George Kurian, et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016, arXiv.
[20] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[21] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[22] Timothy Baldwin, et al. Accurate Evaluation of Segment-level Machine Translation Metrics, 2015, NAACL.
[23] Xinlei Chen, et al. Microsoft COCO Captions: Data Collection and Evaluation Server, 2015, arXiv.
[24] Thibault Sellam, et al. BLEURT: Learning Robust Metrics for Text Generation, 2020, ACL.
[25] Yvette Graham, et al. Re-evaluating Automatic Summarization with BLEU and 192 Shades of ROUGE, 2015, EMNLP.
[26] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.