Post-OCR parsing: building simple and robust parser via BIO tagging

Parsing textual information embedded in images is important for various downstream tasks. However, many previously developed parsers can only handle information presented as a one-dimensional sequence. Here, we present the Post-OCR Tagging-based Parser (POT), a simple and robust parser that parses visually embedded text by BIO-tagging the output of optical character recognition (OCR). This shallow parsing approach makes it possible to build a robust neural parser with fewer than a thousand labeled examples. POT is validated on receipt and name card parsing tasks.
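To illustrate what BIO-tagging OCR output means in practice, the following is a minimal sketch of the decoding step only: it groups tokens that carry B- (begin), I- (inside), and O (outside) tags into labeled fields. It assumes the OCR tokens have already been arranged in reading order and that a tagger has already predicted one tag per token. The function name `decode_bio` and the field names (`menu_name`, `count`, `price`) are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import List, Optional, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (field_label, text) pairs.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[str, List[str]]] = []
    current_label: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new field.
            current_label = tag[2:]
            spans.append((current_label, [token]))
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag with the same label continues the open field.
            spans[-1][1].append(token)
        elif tag.startswith("I-"):
            # A stray I- tag (no matching open field) starts a new field.
            current_label = tag[2:]
            spans.append((current_label, [token]))
        else:
            # An O tag closes any open field and is otherwise ignored.
            current_label = None

    return [(label, " ".join(words)) for label, words in spans]


# Toy receipt line: OCR tokens in reading order with assumed tagger output.
tokens = ["Cafe", "Latte", "2", "4.50", "Thank", "you"]
tags = ["B-menu_name", "I-menu_name", "B-count", "B-price", "O", "O"]
parsed = decode_bio(tokens, tags)
```

Treating a stray I- tag as the start of a new field is one of several reasonable recovery policies for imperfect tagger output; dropping such tokens would be an equally valid choice.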
