Unified Pretraining Framework for Document Understanding
Vlad I. Morariu | A. Nenkova | Jiuxiang Gu | Jason Kuen | Handong Zhao | R. Jain | Nikolaos Barmpalios | Tong Sun
[1] Hongfu Liu, et al. SelfDoc: Self-Supervised Document Representation Learning, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Tomasz Dwojak, et al. Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer, 2021, ICDAR.
[3] Cha Zhang, et al. LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding, 2020, ACL.
[4] Abdel-rahman Mohamed, et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations, 2020, NeurIPS.
[5] Furu Wei, et al. LayoutLM: Pre-training of Text and Layout for Document Image Understanding, 2019, KDD.
[6] Seunghyun Park, et al. CORD: A Consolidated Receipt Dataset for Post-OCR Parsing, 2019.
[7] Omri Ben-Eliezer, et al. READ: Recursive Autoencoders for Document Layout Generation, 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[8] Furu Wei, et al. VL-BERT: Pre-training of Generic Visual-Linguistic Representations, 2019, ICLR.
[9] Nan Duan, et al. Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training, 2019, AAAI.
[10] Antonio Jimeno-Yepes, et al. PubLayNet: Largest Dataset Ever for Document Layout Analysis, 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).
[11] Iryna Gurevych, et al. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks, 2019, EMNLP.
[12] Stefan Lee, et al. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks, 2019, NeurIPS.
[13] Greg Mori, et al. Similarity-Preserving Knowledge Distillation, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[14] Jean-Philippe Thiran, et al. FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents, 2019, 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW).
[15] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[16] Oriol Vinyals, et al. Neural Discrete Representation Learning, 2017, NIPS.
[17] Enhua Wu, et al. Squeeze-and-Excitation Networks, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[18] Eneko Agirre, et al. SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation, 2017, *SEMEVAL.
[19] Samuel R. Bowman, et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference, 2017, NAACL.
[20] Ross B. Girshick, et al. Mask R-CNN, 2017, arXiv:1703.06870.
[21] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.
[22] Diyi Yang, et al. Hierarchical Attention Networks for Document Classification, 2016, NAACL.
[23] Ting Liu, et al. Recent advances in convolutional neural networks, 2015, Pattern Recognit.
[24] Ross B. Girshick, et al. Fast R-CNN, 2015, arXiv:1504.08083.
[25] Konstantinos G. Derpanis, et al. Evaluation of deep convolutional nets for document image classification and retrieval, 2015, 2015 13th International Conference on Document Analysis and Recognition (ICDAR).
[26] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[27] Fei-Fei Li, et al. ImageNet: A large-scale hierarchical image database, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[28] Anthony W. Kay, et al. Tesseract: an open-source optical character recognition engine, 2007.
[29] Shlomo Argamon, et al. Building a test collection for complex document information processing, 2006, SIGIR.
[30] Jaime Carbonell, et al. Multi-Document Summarization By Sentence Extraction, 2000.
[31] Shafiq R. Joty, et al. Self-Supervised Relationship Probing, 2020, NeurIPS.
[32] Cordelia Schmid, et al. Product Quantization for Nearest Neighbor Search, 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.