Form-NLU: Dataset for the Form Natural Language Understanding
[1] Soyeon Caren Han, et al. Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis, 2022, COLING.
[2] Soyeon Caren Han, et al. Understanding Attention for Vision-and-Language Tasks, 2022, COLING.
[3] Soyeon Caren Han, et al. V-Doc: Visual questions answers with Documents, 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Furu Wei, et al. LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking, 2022, ACM Multimedia.
[5] Liqing Zhang, et al. XYLayoutLM: Towards Layout-Aware Multimodal Networks For Visually-Rich Document Understanding, 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Zuyi Bao, et al. Entity Relation Extraction as Dependency Parsing in Visually Rich Documents, 2021, EMNLP.
[7] Stefan Feuerriegel, et al. DocParser: Hierarchical Document Structure Parsing from Renderings, 2021, AAAI.
[8] P. Biecek, et al. Kleister: Key Information Extraction Datasets Involving Long Documents with Complex Layouts, 2021, ICDAR.
[9] Furu Wei, et al. LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding, 2021, ArXiv.
[10] Wonjae Kim, et al. ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision, 2021, ICML.
[11] Kyosuke Nishida, et al. VisualMRC: Machine Reading Comprehension on Document Images, 2021, AAAI.
[12] Jiaxin Zhang, et al. Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution, 2021, AAAI.
[13] Cha Zhang, et al. LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding, 2020, ACL.
[14] C. V. Jawahar, et al. DocVQA: A Dataset for VQA on Document Images, 2020, 2021 IEEE Winter Conference on Applications of Computer Vision (WACV).
[15] Furu Wei, et al. DocBank: A Benchmark Dataset for Document Layout Analysis, 2020, COLING.
[16] Lucian Popa, et al. Global Table Extractor (GTE): A Framework for Joint Table Identification and Cell Structure Recognition Using Visual Context, 2020, 2021 IEEE Winter Conference on Applications of Computer Vision (WACV).
[17] Furu Wei, et al. LayoutLM: Pre-training of Text and Layout for Document Image Understanding, 2019, KDD.
[18] Antonio Jimeno-Yepes, et al. Image-based table recognition: data, model, and evaluation, 2019, ECCV.
[19] Trevor Darrell, et al. Iterative Answer Prediction With Pointer-Augmented Multimodal Transformers for TextVQA, 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Seunghyun Park, et al. CORD: A Consolidated Receipt Dataset for Post-OCR Parsing, 2019.
[21] Mohit Bansal, et al. LXMERT: Learning Cross-Modality Encoder Representations from Transformers, 2019, EMNLP.
[22] Antonio Jimeno-Yepes, et al. PubLayNet: Largest Dataset Ever for Document Layout Analysis, 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).
[23] Heyan Huang, et al. Complicated Table Structure Recognition, 2019, ArXiv.
[24] Cho-Jui Hsieh, et al. VisualBERT: A Simple and Performant Baseline for Vision and Language, 2019, ArXiv.
[25] Jean-Philippe Thiran, et al. FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents, 2019, 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW).
[26] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[27] Ross B. Girshick, et al. Mask R-CNN, 2017, 1703.06870.
[28] W. Ng, et al. The Informativeness of Substantial Shareholder Trading in the Lead Up to a Takeover Bid, 2016.
[29] Navdeep Jaitly, et al. Pointer Networks, 2015, NIPS.
[30] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[31] W. A. Martin, et al. Parsing, 1980, ACL.
[32] Jacob Cohen. A Coefficient of Agreement for Nominal Scales, 1960.
[33] Furu Wei, et al. XFUND: A Benchmark Dataset for Multilingual Visually Rich Form Understanding, 2022, FINDINGS.