CLIPTER: Looking at the Bigger Picture in Scene Text Recognition
Aviad Aberdam | Roy Ganz | Shai Mazor | Ron Litman | Oren Nuriel | Alona Golts | David Bensaid | Royee Tichauer
[1] Ashish V. Thapliyal, et al. PaLI: A Jointly-Scaled Multilingual Language-Image Model, 2022, ICLR.
[2] Ali Furkan Biten, et al. Out-of-Vocabulary Challenge Report, 2022, ECCV Workshops.
[3] Oron Anschel, et al. GLASS: Global to Local Attention for Scene-Text Spotting, 2022, ECCV.
[4] Rowel Atienza, et al. Scene Text Recognition with Permuted Autoregressive Sequence Models, 2022, ECCV.
[5] Ruiyu Li, et al. Context-Based Contrastive Learning for Scene Text Recognition, 2022, AAAI.
[6] Hao Liu, et al. Perceiving Stroke-Semantic Context: Hierarchical Contrastive Learning for Robust Scene Text Recognition, 2022, AAAI.
[7] Errui Ding, et al. MaskOCR: Text Recognition with Masked Encoder-Decoder Pretraining, 2022, ArXiv.
[8] Zhe Gan, et al. GIT: A Generative Image-to-text Transformer for Vision and Language, 2022, Trans. Mach. Learn. Res.
[9] Thomas Kipf, et al. Simple Open-Vocabulary Object Detection with Vision Transformers, 2022, ArXiv.
[10] Aviad Aberdam, et al. Multimodal Semi-Supervised Learning for Text Recognition, 2022, ArXiv.
[11] Oriol Vinyals, et al. Flamingo: a Visual Language Model for Few-Shot Learning, 2022, NeurIPS.
[12] Peng Wang, et al. Pushing the Performance Limit of Scene Text Recognizer without Human Annotation, 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[13] A. Bissacco, et al. Towards End-to-End Unified Scene Text Detection and Layout Analysis, 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Jingdong Chen, et al. SimAN: Exploring Self-Supervised Representation Learning of Scene Text via Similarity-Aware Normalization, 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Dahua Lin, et al. SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition, 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Philip H. S. Torr, et al. Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting, 2022, ECCV.
[17] P. Perona, et al. Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer, 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Jingren Zhou, et al. OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework, 2022, ICML.
[19] S. Hoi, et al. BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation, 2022, ICML.
[20] Srikar Appalaraju, et al. LaTr: Layout-Aware Transformer for Scene-Text VQA, 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Sungrae Park, et al. Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features, 2021, ECCV.
[22] Xiaowei Hu, et al. Scaling Up Vision-Language Pretraining for Image Captioning, 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Ross B. Girshick, et al. Masked Autoencoders Are Scalable Vision Learners, 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Aleksandr Drozd, et al. Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics, 2021, INSIGHTS.
[25] Adams Wei Yu, et al. SimVLM: Simple Visual Language Model Pretraining with Weak Supervision, 2021, ICLR.
[26] Yongdong Zhang, et al. From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[27] Ayan Kumar Bhunia, et al. Towards the Unseen: Iterative Text Recognition by Distilling from Errors, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[28] Junnan Li, et al. Align before Fuse: Vision and Language Representation Learning with Momentum Distillation, 2021, NeurIPS.
[29] Bhargava Urala Kota, et al. DocFormer: End-to-End Transformer for Document Understanding, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[30] Rowel Atienza, et al. Vision Transformer for Fast and Efficient Scene Text Recognition, 2021, ICDAR.
[31] Tal Hassner, et al. TextOCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Chunhua Shen, et al. ABCNet v2: Adaptive Bezier-Curve Network for Real-Time End-to-End Text Spotting, 2021, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[33] Julien Mairal, et al. Emerging Properties in Self-Supervised Vision Transformers, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[34] Yongdong Zhang, et al. Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Kiyoharu Aizawa, et al. What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Ilya Sutskever, et al. Learning Transferable Visual Models From Natural Language Supervision, 2021, ICML.
[37] Lei Zhang, et al. VinVL: Making Visual Representations Matter in Vision-Language Models, 2021, ArXiv.
[38] R. Manmatha, et al. On Calibration of Scene-Text Recognition Models, 2020, ECCV Workshops.
[39] Pietro Perona, et al. Sequence-to-Sequence Contrastive Learning for Text Recognition, 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Shiliang Pu, et al. MANGO: A Mask Attention Guided One-Stage Scene Text Spotter, 2020, AAAI.
[41] S. Gelly, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, 2020, ICLR.
[42] Bin Li, et al. Deformable DETR: Deformable Transformers for End-to-End Object Detection, 2020, ICLR.
[43] Jing Huang, et al. Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting, 2020, ECCV.
[44] Weiping Wang, et al. SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[45] Jiebo Luo, et al. On Vocabulary Reliance in Scene Text Recognition, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Errui Ding, et al. Towards Accurate Scene Text Recognition With Semantic Reasoning Networks, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[47] R. Manmatha, et al. SCATTER: Selective Context Attentional Scene Text Recognizer, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Hamid Reza Vaezi Joze, et al. MMTM: Multimodal Transfer Module for CNN Fusion, 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[49] Wei Liu, et al. Chinese Street View Text: Large-Scale Chinese Text Reading With Partially Supervised Learning, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[50] Lianwen Jin, et al. ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text - RRC-ArT, 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).
[51] Kai Zhou, et al. ICDAR 2019 Robust Reading Challenge on Reading Chinese Text on Signboard, 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).
[52] Cho-Jui Hsieh, et al. VisualBERT: A Simple and Performant Baseline for Vision and Language, 2019, ArXiv.
[53] Wafa Khlif, et al. ICDAR2019 Robust Reading Challenge on Multi-lingual Scene Text Detection and Recognition - RRC-MLT-2019, 2019, 2019 International Conference on Document Analysis and Recognition (ICDAR).
[54] Seong Joon Oh, et al. What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[55] Shijian Lu, et al. ICDAR2017 Competition on Reading Chinese Text in the Wild (RCTW-17), 2017, 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR).
[56] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[57] A. Vedaldi, et al. Synthetic Data for Text Localisation in Natural Images, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[58] Jiri Matas, et al. COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images, 2016, ArXiv.
[59] Ernest Valveny, et al. ICDAR 2015 competition on Robust Reading, 2015, 2015 13th International Conference on Document Analysis and Recognition (ICDAR).
[60] Palaiahnakote Shivakumara, et al. A robust arbitrary text detection system for natural scene images, 2014, Expert Syst. Appl.
[61] Andrew Zisserman, et al. Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition, 2014, ArXiv.
[62] Jon Almazán, et al. ICDAR 2013 Robust Reading Competition, 2013, 2013 12th International Conference on Document Analysis and Recognition.
[63] Kai Wang, et al. End-to-end scene text recognition, 2011, 2011 International Conference on Computer Vision.
[64] C. V. Jawahar, et al. Scene Text Recognition using Higher Order Language Priors, 2009, BMVC.
[65] S. Hochreiter, et al. Long Short-Term Memory, 1997, Neural Computation.
[66] Sharon Fogel, et al. TextAdaIN: Fine-Grained AdaIN for Robust Text Recognition, 2021, ArXiv.
[67] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.