Beyond Document Page Classification: Design, Datasets, and Challenges
[1] Kevin Leach,et al. On Evaluation of Document Classification using RVL-CDIP , 2023, ArXiv.
[2] Matthew B. Blaschko,et al. Document Understanding Dataset and Evaluation (DUDE) , 2023, 2023 IEEE/CVF International Conference on Computer Vision (ICCV).
[3] M. Turski,et al. CCpdf: Building a High Quality Corpus for Visually Rich Documents from Web Crawl Data , 2023, ICDAR.
[4] Mickaël Coustaty,et al. DocILE Benchmark for Document Information Localization and Extraction , 2023, ICDAR.
[5] Ernest Valveny,et al. Hierarchical multimodal transformers for Multi-Page DocVQA , 2022, Pattern Recognit.
[6] Mohit Bansal,et al. Unifying Vision, Text, and Layout for Universal Document Processing , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Lukas Klein,et al. A Call to Reflect on Evaluation Practices for Failure Detection in Image Classification , 2022, ICLR.
[8] Kevin Leach,et al. Evaluating Out-of-Distribution Performance on Document Image Classifiers , 2022, NeurIPS.
[9] Julian Martin Eisenschlos,et al. Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding , 2022, ICML.
[10] Nisarg Mehta,et al. Tab this folder of documents: page stream segmentation of business documents , 2022, ACM Symposium on Document Engineering.
[11] Kok Wei Chee,et al. Augraphy: A Data Augmentation Library for Document Images , 2022, ICDAR.
[12] Fuli Feng,et al. Towards Complex Document Understanding By Discrete Reasoning , 2022, ACM Multimedia.
[13] Stepán Simsa,et al. Business Document Information Extraction: Towards Practical Benchmarks , 2022, CLEF.
[14] B. Pfitzmann,et al. DocLayNet: A Large Human-Annotated Dataset for Document-Layout Segmentation , 2022, KDD.
[15] Mickaël Coustaty,et al. VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification , 2022, Pattern Recognit.
[16] Vlad I. Morariu,et al. Unified Pretraining Framework for Document Understanding , 2022, NeurIPS.
[17] Furu Wei,et al. LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking , 2022, ACM Multimedia.
[18] Furu Wei,et al. DiT: Self-supervised Pre-training for Document Image Transformer , 2022, ACM Multimedia.
[19] Ali Furkan Biten,et al. OCR-IDL: OCR Annotations for Industry Document Library Dataset , 2022, ECCV Workshops.
[20] David Sánchez,et al. The Text Anonymization Benchmark (TAB): A Dedicated Corpus and Evaluation Framework for Text Anonymization , 2022, Computational Linguistics.
[21] Umapada Pal,et al. DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis , 2021, ICDAR.
[22] Bhargava Urala Kota,et al. DocFormer: End-to-End Transformer for Document Understanding , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[23] Florian Matthes,et al. Anonymization of German legal court rulings , 2021, ICAIL.
[24] Constantin Spille,et al. Key Information Extraction From Documents: Evaluation And Generator , 2021, DeepOntoNLP/X-SENTIMENT@ESWC.
[25] Hongfu Liu,et al. SelfDoc: Self-Supervised Document Representation Learning , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] P. Biecek,et al. Kleister: Key Information Extraction Datasets Involving Long Documents with Complex Layouts , 2021, ICDAR.
[27] Ernest Valveny,et al. InfographicVQA , 2021, 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV).
[28] Tomasz Dwojak,et al. Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer , 2021, ICDAR.
[29] Seong Joon Oh,et al. Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Shashank Mujumdar,et al. Towards a Multi-modal, Multi-task Learning based Pre-training Framework for Document Representation Learning , 2020, ArXiv.
[31] C. V. Jawahar,et al. Document Visual Question Answering Challenge 2020 , 2020, ArXiv.
[32] Xiaohua Zhai,et al. Are we done with ImageNet? , 2020, ArXiv.
[33] Furu Wei,et al. DocBank: A Benchmark Dataset for Document Layout Analysis , 2020, COLING.
[34] Lucian Popa,et al. Global Table Extractor (GTE): A Framework for Joint Table Identification and Cell Structure Recognition Using Visual Context , 2020, 2021 IEEE Winter Conference on Applications of Computer Vision (WACV).
[35] Antonio Jimeno-Yepes,et al. Image-based table recognition: data, model, and evaluation , 2019, ECCV.
[36] Gerhard Heyer,et al. Multi-modal page stream segmentation with convolutional neural networks , 2019, Lang. Resour. Evaluation.
[37] Zheng Huang,et al. ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction , 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).
[38] Antonio Jimeno-Yepes,et al. PubLayNet: Largest Dataset Ever for Document Layout Analysis , 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).
[39] Jean-Philippe Thiran,et al. FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents , 2019, 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW).
[40] Kilian Q. Weinberger,et al. On Calibration of Modern Neural Networks , 2017, ICML.
[41] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[42] Ran El-Yaniv,et al. Selective Classification for Deep Neural Networks , 2017, NIPS.
[43] Siddharth Garimella,et al. Identification of Receipts in a Multi-receipt Image using Spectral Clustering , 2016.
[44] Ignazio Gallo,et al. Deep Neural Networks for Page Stream Segmentation and Classification , 2016, 2016 International Conference on Digital Image Computing: Techniques and Applications (DICTA).
[45] Arnaud Chevallier,et al. Strategic Thinking in Complex Problem Solving , 2016.
[46] Konstantinos G. Derpanis,et al. Evaluation of deep convolutional nets for document image classification and retrieval , 2015, 2015 13th International Conference on Document Analysis and Recognition (ICDAR).
[47] Milos Hauskrecht,et al. Obtaining Well Calibrated Probabilities Using Bayesian Binning , 2015, AAAI.
[48] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[49] Jayant Kumar,et al. Structural similarity for document image classification and retrieval , 2014, Pattern Recognit. Lett.
[50] David S. Doermann,et al. Unsupervised Classification of Structurally Similar Document Images , 2013, 2013 12th International Conference on Document Analysis and Recognition.
[51] Albert Gordo,et al. Document Classification and Page Stream Segmentation for Digital Mailroom Applications , 2013, 2013 12th International Conference on Document Analysis and Recognition.
[52] Albert Gordo,et al. A Bag-of-Pages Approach to Unordered Multi-page Document Classification , 2010, 2010 20th International Conference on Pattern Recognition.
[53] D. Doermann,et al. Automatic Document Logo Detection , 2007, Ninth International Conference on Document Analysis and Recognition (ICDAR 2007).
[54] Shlomo Argamon,et al. Building a test collection for complex document information processing , 2006, SIGIR.
[55] Rich Caruana,et al. Predicting good probabilities with supervised learning , 2005, ICML.
[56] George R Thoma,et al. Image informatics at a national research center , 2005, Computerized medical imaging and graphics : the official journal of the Computerized Medical Imaging Society.
[57] Vladimir Vapnik,et al. Principles of Risk Minimization for Learning Theory , 1991, NIPS.
[58] Jordy Van Landeghem,et al. ICDAR 2023 Competition on Document UnderstanDing of Everything (DUDE) , 2023, IEEE International Conference on Document Analysis and Recognition.
[59] Seunghyun Park,et al. Donut: Document Understanding Transformer without OCR , 2021, ArXiv.
[60] Christophe Garcia,et al. Data-Efficient Information Extraction from Documents with Pre-trained Language Models , 2021, ICDAR Workshops.