Paper Information - OmniPrint: A Configurable Printed Character Synthesizer

We introduce OmniPrint, a synthetic data generator of isolated printed characters, geared toward machine learning research. It draws inspiration from famous datasets such as MNIST, SVHN and Omniglot, but offers the capability of generating a wide variety of printed characters from various languages, fonts and styles, with customized distortions. We include 935 fonts from 27 scripts and many types of distortions. As a proof of concept, we show various use cases, including an example of a meta-learning dataset designed for the upcoming MetaDL NeurIPS 2021 competition. OmniPrint is available at https://github.com/SunHaozhe/OmniPrint.

Figure 1: Examples of characters generated by OmniPrint.
