LayoutLM: Pre-training of Text and Layout for Document Image Understanding
Furu Wei | Ming Zhou | Yiheng Xu | Lei Cui | Minghao Li | Shaohan Huang
[1] Marcus Liwicki,et al. Cutting the Error by Half: Investigation of Very Deep CNN and Advanced Training Strategies for Document Image Classification , 2017, 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR).
[2] Matheus Palhares Viana,et al. Fast CNN-Based Document Layout Analysis , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).
[3] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Giovanni Soda,et al. Artificial neural networks for document analysis and recognition , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[5] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[6] Shinjae Yoo,et al. Visual Detection with Context for Document Layout Analysis , 2019, EMNLP.
[7] Robert M. Haralick,et al. Document page decomposition by the bounding-box project , 1995, Proceedings of 3rd International Conference on Document Analysis and Recognition.
[8] H. Emptoz,et al. A fast and efficient method for extracting text paragraphs and graphics from unconstrained documents , 1992, Proceedings., 11th IAPR International Conference on Pattern Recognition. Vol.II. Conference B: Pattern Recognition Methodology and Systems.
[9] Ujjwal Bhattacharya,et al. Document Image Classification with Intra-Domain Transfer Learning and Stacked Generalization of Deep Convolutional Neural Networks , 2018, 2018 24th International Conference on Pattern Recognition (ICPR).
[10] Sergey Ioffe,et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning , 2016, AAAI.
[11] Arnab Nandi,et al. Deterministic Routing between Layout Abstractions for Multi-Scale Classification of Visually Rich Documents , 2019, IJCAI.
[12] A. Peter Johnson,et al. A Fast Algorithm for Bottom-Up Document Layout Analysis , 1997, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[13] Shlomo Argamon,et al. Building a test collection for complex document information processing , 2006, SIGIR.
[14] Omer Levy,et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.
[15] Lawrence O'Gorman,et al. The Document Spectrum for Page Layout Analysis , 1993, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[16] Steffen Bickel,et al. Chargrid: Towards Understanding 2D Documents , 2018, EMNLP.
[17] Andreas Dengel,et al. DeepDeSRT: Deep Learning for Detection and Structure Recognition of Tables in Document Images , 2017, 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR).
[18] Rolf Ingold,et al. Evaluation of SVM, MLP and GMM Classifiers for Layout Analysis of Historical Documents , 2013, 2013 12th International Conference on Document Analysis and Recognition.
[19] Mohammad Mehdi Rashidi,et al. Modular Multimodal Architecture for Document Classification , 2019, ArXiv.
[20] Antonio Jimeno-Yepes,et al. PubLayNet: Largest Dataset Ever for Document Layout Analysis , 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).
[21] Konstantinos G. Derpanis,et al. Evaluation of deep convolutional nets for document image classification and retrieval , 2015, 2015 13th International Conference on Document Analysis and Recognition (ICDAR).
[22] Robert M. Haralick,et al. Recursive X-Y cut using bounding boxes of connected components , 1995, Proceedings of 3rd International Conference on Document Analysis and Recognition.
[23] Ersin Yumer,et al. Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Paul A. Viola,et al. Learning nongenerative grammatical models for document analysis , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.
[25] Michael S. Bernstein,et al. Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.
[26] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[27] Zhi Tang,et al. A Table Detection Method for PDF Documents Based on Convolutional Neural Networks , 2016, 2016 12th IAPR Workshop on Document Analysis Systems (DAS).
[28] Xiaojing Liu,et al. Graph Convolution for Multimodal Information Extraction from Visually Rich Documents , 2019, NAACL.
[29] Jean-Philippe Thiran,et al. FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents , 2019, 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW).
[30] Ross B. Girshick,et al. Mask R-CNN , 2017, arXiv:1703.06870.