DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents
[1] Shiliang Pu, et al. TRIE++: Towards End-to-End Information Extraction from Visually Rich Documents, 2022, ArXiv.
[2] Yu-Gang Jiang, et al. SVTR: Scene Text Recognition with a Single Visual Model, 2022, IJCAI.
[3] Furu Wei, et al. LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking, 2022, ACM Multimedia.
[4] Vlad I. Morariu, et al. End-to-end Document Recognition and Understanding with Dessurt, 2022, ECCV Workshops.
[5] Trevor Darrell, et al. A ConvNet for the 2020s, 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Dongyoon Han, et al. OCR-Free Document Understanding Transformer, 2021, ECCV.
[7] Dianhai Yu, et al. PP-OCRv2: Bag of Tricks for Ultra Lightweight OCR System, 2021, ArXiv.
[8] Sungrae Park, et al. BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents, 2021, AAAI.
[9] Shachar Klaiman, et al. DocReader: Bounding-Box Free Training of a Document Information Extraction Model, 2021, ICDAR.
[10] Zhanghui Kuang, et al. Spatial Dual-Modality Graph Reasoning for Key Information Extraction, 2021, ArXiv.
[11] Tomasz Dwojak, et al. Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer, 2021, ICDAR.
[12] Jiaxin Zhang, et al. Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution, 2021, AAAI.
[13] Cha Zhang, et al. LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding, 2020, ACL.
[14] S. Gelly, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, 2020, ICLR.
[15] Aymen Shabou, et al. VisualWordGrid: Information Extraction From Scanned Documents Using A Multimodal Approach, 2020, ICDAR Workshops.
[16] Qiong Zhang, et al. Robust Layout-aware IE for Visually Rich Documents with Pre-trained Language Models, 2020, SIGIR.
[17] Lukasz Garncarek, et al. LAMBERT: Layout-Aware Language Modeling for Information Extraction, 2020, ICDAR.
[18] Furu Wei, et al. LayoutLM: Pre-training of Text and Layout for Document Image Understanding, 2019, KDD.
[19] Kai Chen, et al. Real-time Scene Text Detection with Differentiable Binarization, 2019, AAAI.
[20] Omer Levy, et al. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension, 2019, ACL.
[21] Timo I. Denk, et al. BERTgrid: Contextualized Embedding for 2D Document Representation and Understanding, 2019, ArXiv.
[22] Xiameng Qin, et al. EATEN: Entity-Aware Attention for Single Shot Visual Text Extraction, 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).
[23] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, ArXiv.
[24] Dongyoon Han, et al. Character Region Awareness for Text Detection, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Seong Joon Oh, et al. What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[26] Xiaohui Zhao, et al. CUTIE: Learning to Understand Documents with Convolutional Universal Text Information Extractor, 2019, ArXiv.
[27] Xiaojing Liu, et al. Graph Convolution for Multimodal Information Extraction from Visually Rich Documents, 2019, NAACL.
[28] Steffen Bickel, et al. Chargrid: Towards Understanding 2D Documents, 2018, EMNLP.
[29] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[30] Ole Winther, et al. CloudScan - A Configuration-Free Invoice Analysis System Using Recurrent Neural Networks, 2017, 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR).
[31] Ming Zhou, et al. Gated Self-Matching Networks for Reading Comprehension and Question Answering, 2017, ACL.
[32] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[33] Bo Chen, et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, 2017, ArXiv.
[34] Serge J. Belongie, et al. Feature Pyramid Networks for Object Detection, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Zhuowen Tu, et al. Aggregated Residual Transformations for Deep Neural Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Frank Hutter, et al. SGDR: Stochastic Gradient Descent with Warm Restarts, 2016, ICLR.
[37] Nachum Dershowitz, et al. OCR Error Correction Using Character Correction and Feature-Based Word Classification, 2016, 2016 12th IAPR Workshop on Document Analysis Systems (DAS).
[38] Kilian Q. Weinberger, et al. Deep Networks with Stochastic Depth, 2016, ECCV.
[39] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Vincent Poulain D'Andecy, et al. Field Extraction from Administrative Documents by Incremental Structural Templates, 2013, 2013 12th International Conference on Document Analysis and Recognition.
[41] Eric Medvet, et al. A probabilistic approach to printed document understanding, 2011, International Journal on Document Analysis and Recognition (IJDAR).
[42] Shlomo Argamon, et al. Building a test collection for complex document information processing, 2006, SIGIR.
[43] Francesca Cesarini, et al. Analysis and understanding of multi-class invoices, 2003, Document Analysis and Recognition.
[44] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[45] Stephen Lin, et al. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[46] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.