StructChart: Perception, Structuring, Reasoning for Visual Chart Understanding
Junchi Yan, Y. Qiao, Hao Peng, Peng Ye, Botian Shi, Bo Zhang, Renqiu Xia, Ning Liao, Hancheng Ye