Data-Efficient Information Extraction from Documents with Pre-trained Language Models
Christophe Garcia | Stefan Duffner | Véronique Eglin | Haytham Elghazel | Alex Aussem | Jérémy Espinas | Clément Sage | Thibault Douzon
[1] Zheng Huang,et al. ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction , 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).
[2] Sebastian Ruder,et al. Universal Language Model Fine-tuning for Text Classification , 2018, ACL.
[3] Francesca Cesarini,et al. INFORMys: A Flexible Invoice-Like Form-Reader System , 1998, IEEE Trans. Pattern Anal. Mach. Intell.
[4] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[5] Furu Wei,et al. LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding , 2020, ACL.
[6] Xiaohui Zhao,et al. CUTIE: Learning to Understand Documents with Convolutional Universal Text Information Extractor , 2019, ArXiv.
[7] Ole Winther,et al. Attend, Copy, Parse End-to-end Information Extraction from Documents , 2018, 2019 International Conference on Document Analysis and Recognition (ICDAR).
[8] Qiong Zhang,et al. Robust Layout-aware IE for Visually Rich Documents with Pre-trained Language Models , 2020, SIGIR.
[9] Shashank Mujumdar,et al. Towards a Multi-modal, Multi-task Learning based Pre-training Framework for Document Representation Learning , 2020, ArXiv.
[10] Arman Cohan,et al. Longformer: The Long-Document Transformer , 2020, ArXiv.
[11] Ping Gong,et al. PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks , 2020, ArXiv.
[12] Konstantinos G. Derpanis,et al. Evaluation of deep convolutional nets for document image classification and retrieval , 2015, 2015 13th International Conference on Document Analysis and Recognition (ICDAR).
[13] Frederick Reiss,et al. Rule-Based Information Extraction is Dead! Long Live Rule-Based Information Extraction Systems! , 2013, EMNLP.
[14] Furu Wei,et al. LayoutLM: Pre-training of Text and Layout for Document Image Understanding , 2019, KDD.
[15] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[16] Christian Reisswig,et al. BERTgrid: Contextualized Embedding for 2D Document Representation and Understanding , 2019, ArXiv.
[17] Mitchell P. Marcus,et al. Text Chunking using Transformation-Based Learning , 1995, VLC@ACL.
[18] Omer Levy,et al. Zero-Shot Relation Extraction via Reading Comprehension , 2017, CoNLL.
[19] Shlomo Argamon,et al. Building a test collection for complex document information processing , 2006, SIGIR.
[20] Véronique Eglin,et al. End-to-End Extraction of Structured Information from Business Documents with Pointer-Generator Networks , 2020, SPNLP.
[21] BROS: A Pre-trained Language Model , 2020 .
[22] Tuan Anh Nguyen Dang,et al. End-to-End Information Extraction by Character-Level Embedding and Multi-Stage Attentional U-Net , 2019, BMVC.
[23] Jonathan Berant,et al. Question Answering is a Format; When is it Useful? , 2019, ArXiv.
[24] Sriram Raghavan,et al. Regular Expression Learning for Information Extraction , 2008, EMNLP.
[25] Xipeng Qiu,et al. Pre-trained models for natural language processing: A survey , 2020, Science China Technological Sciences.
[26] David Yarowsky,et al. Techniques in Speech Acoustics , 1999, Computational Linguistics.
[27] Hamid Motahari,et al. A Report on the First Workshop on Document Intelligence (DI) at NeurIPS 2019 , 2020, SIGKDD Explor.
[28] Ole Winther,et al. CloudScan - A Configuration-Free Invoice Analysis System Using Recurrent Neural Networks , 2017, 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR).
[29] C. V. Jawahar,et al. DocVQA: A Dataset for VQA on Document Images , 2021, 2021 IEEE Winter Conference on Applications of Computer Vision (WACV).
[30] Roy Shilkrot,et al. Cardinal Graph Convolution Framework for Document Information Extraction , 2020, DocEng.
[31] Kilian Q. Weinberger,et al. Revisiting Few-sample BERT Fine-tuning , 2020, ArXiv.
[32] Jean-Philippe Thiran,et al. FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents , 2019, 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW).
[33] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[34] Zhiyu Chen,et al. Few-shot NLG with Pre-trained Language Model , 2020, ACL.
[35] Seunghyun Park,et al. CORD: A Consolidated Receipt Dataset for Post-OCR Parsing , 2019 .
[36] Regina Barzilay,et al. GraphIE: A Graph-Based Framework for Information Extraction , 2018, NAACL.
[37] Véronique Eglin,et al. Recurrent Neural Network Approach for Table Field Extraction in Business Documents , 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).
[38] Steffen Bickel,et al. Chargrid: Towards Understanding 2D Documents , 2018, EMNLP.
[39] Xiaojing Liu,et al. Graph Convolution for Multimodal Information Extraction from Visually Rich Documents , 2019, NAACL.