Explaining Vision and Language through Graphs of Events in Space and Time
[1] Noah A. Smith,et al. How Language Model Hallucinations Can Snowball , 2023, ArXiv.
[2] Jing Liu,et al. VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset , 2023, ArXiv.
[3] Humphrey Shi,et al. Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators , 2023, 2023 IEEE/CVF International Conference on Computer Vision (ICCV).
[4] D. Erhan,et al. Phenaki: Variable Length Video Generation From Open Domain Textual Description , 2022, ICLR.
[5] Yaniv Taigman,et al. Make-A-Video: Text-to-Video Generation without Text-Video Data , 2022, ICLR.
[6] Wendi Zheng,et al. CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers , 2022, ICLR.
[7] Tim K. Marks,et al. (2.5+1)D Spatio-Temporal Scene Graphs for Video Question Answering , 2022, AAAI.
[8] Jian Liang,et al. NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion , 2021, ECCV.
[9] Marius Leordeanu,et al. A hierarchical approach to vision-based language generation: from simple sentences to complex natural language , 2020, COLING.
[10] Thibault Sellam,et al. BLEURT: Learning Robust Metrics for Text Generation , 2020, ACL.
[11] Junchi Yan,et al. Neural Graph Matching Network: Learning Lawler’s Quadratic Assignment Problem With Extension to Hypergraph and Multiple-Graph Matching , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[12] Rama Chellappa,et al. Conditional GAN with Discriminative Filter Generation for Text-to-Video Synthesis , 2019, IJCAI.
[13] Kilian Q. Weinberger,et al. BERTScore: Evaluating Text Generation with BERT , 2019, ICLR.
[14] Jiachen Li,et al. Text Guided Person Image Synthesis , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Abhinav Gupta,et al. Videos as Space-Time Region Graphs , 2018, ECCV.
[16] Luowei Zhou,et al. End-to-End Dense Video Captioning with Masked Transformer , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[17] Wei Liu,et al. Reconstruction Network for Video Captioning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[18] Yitong Li,et al. Video Generation From Text , 2017, AAAI.
[19] Abhinav Gupta,et al. Temporal Dynamic Graph LSTM for Action-Driven Video Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[20] Heng Tao Shen,et al. Video Captioning With Attention-Based LSTM and Semantic Consistency , 2017, IEEE Transactions on Multimedia.
[21] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[22] C. Krishna Mohan,et al. Graph formulation of video activities for abnormal activity recognition , 2017, Pattern Recognit.
[23] Stephen Gould,et al. SPICE: Semantic Propositional Image Caption Evaluation , 2016, ECCV.
[24] Jiasen Lu,et al. Hierarchical Question-Image Co-Attention for Visual Question Answering , 2016, NIPS.
[25] Bernt Schiele,et al. Generative Adversarial Text to Image Synthesis , 2016, ICML.
[26] Margaret Mitchell,et al. VQA: Visual Question Answering , 2015, International Journal of Computer Vision.
[27] C. Lawrence Zitnick,et al. CIDEr: Consensus-based image description evaluation , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[29] Philipp Koehn,et al. Abstract Meaning Representation for Sembanking , 2013, LAW@ACL.
[30] Oren Etzioni,et al. Towards Coherent Multi-Document Summarization , 2013, NAACL.
[31] William Brendel,et al. Learning spatiotemporal graphs of human activities , 2011, 2011 International Conference on Computer Vision.
[32] Anthony G. Cohn,et al. Relational Graph Mining for Learning Events from Video , 2010, STAIRS.
[33] Martial Hebert,et al. A spectral technique for correspondence problems using pairwise constraints , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.
[34] Luke S. Zettlemoyer,et al. Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars , 2005, UAI.
[35] Alon Lavie,et al. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments , 2005, IEEvaluation@ACL.
[36] Chin-Yew Lin,et al. ROUGE: A Package for Automatic Evaluation of Summaries , 2004, ACL 2004.
[37] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[38] Dekang Lin,et al. A dependency-based method for evaluating broad-coverage parsers , 1995, Natural Language Engineering.
[39] William C. Mann,et al. Rhetorical Structure Theory: Toward a functional theory of text organization , 1988.
[40] Chen Shen,et al. Self-Adaptive Neural Module Transformer for Visual Question Answering , 2021, IEEE Transactions on Multimedia.