Regular expressions for decoding of neural network outputs

This article proposes a convenient tool for decoding the output of neural networks trained with Connectionist Temporal Classification (CTC) for handwritten text recognition. We use regular expressions to describe the complex structures expected in the writing, and we employ the corresponding finite automata to build a decoder. We analyze theoretically which calculations are relevant and which can be avoided; a great speed-up results from an approximation. We conclude that the approximation is most likely to fail when the regular expression does not match the ground truth, which is harmless for many applications, since the already low probability is merely underestimated further. The proposed decoder is very efficient compared to other decoding methods. Its applications range from information retrieval to full-text recognition, and we point to applications in which the proposed decoder has been integrated successfully.
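The core idea can be sketched in a few lines: run a Viterbi-style best-path search over the CTC output matrix while a deterministic finite automaton (DFA), compiled from the regular expression, prunes every label sequence the expression cannot accept. The following is a minimal, hypothetical sketch, not the paper's implementation; the hand-built DFA for the toy regex `ab*`, the function name, and the data layout are all assumptions for illustration.

```python
import math

BLANK = "-"  # CTC blank symbol

def ctc_dfa_decode(log_probs, alphabet, delta, start, accepting):
    """Best-path CTC decoding constrained by a DFA.

    log_probs: list of per-frame dicts mapping symbol -> log probability.
    delta: partial DFA transition table, dict (state, symbol) -> state.
    Returns the highest-scoring collapsed string accepted by the DFA, or None.
    """
    # Hypothesis key: (dfa_state, last_symbol) -> (score, decoded_string).
    hyps = {(start, BLANK): (0.0, "")}
    for frame in log_probs:
        new_hyps = {}

        def push(key, score, seq):
            if key not in new_hyps or score > new_hyps[key][0]:
                new_hyps[key] = (score, seq)

        for (state, last), (score, seq) in hyps.items():
            for sym in alphabet:
                s = score + frame[sym]
                if sym == BLANK:
                    push((state, BLANK), s, seq)   # blank: no emission
                elif sym == last:
                    push((state, sym), s, seq)     # repeated label collapses
                elif (state, sym) in delta:
                    nxt = delta[(state, sym)]      # new emission: DFA must allow it
                    push((nxt, sym), s, seq + sym)
                # otherwise the DFA rejects this extension: prune the path
        hyps = new_hyps
    accepted = [(sc, seq) for (st, _), (sc, seq) in hyps.items() if st in accepting]
    return max(accepted)[1] if accepted else None

# Toy example: regex "ab*" over alphabet {a, b} plus the CTC blank.
delta = {(0, "a"): 1, (1, "b"): 1}   # state 1 is accepting
probs = [{"-": 0.1, "a": 0.6, "b": 0.3},
         {"-": 0.2, "a": 0.3, "b": 0.5},
         {"-": 0.7, "a": 0.1, "b": 0.2}]
frames = [{k: math.log(v) for k, v in f.items()} for f in probs]
print(ctc_dfa_decode(frames, ["-", "a", "b"], delta, 0, {1}))  # prints "ab"
```

Because impossible extensions are pruned at every frame, the search visits only (DFA state, last label) pairs, which is what makes the constrained decoder cheap compared to rescoring full hypotheses against the expression afterwards.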
