The A2iA Multi-lingual Text Recognition System at the Second Maurdor Evaluation

This paper describes the system submitted by A2iA to the second Maurdor evaluation for multi-lingual text recognition. A system based on recurrent neural networks and weighted finite-state transducers was used for both printed and handwritten text recognition in French, English and Arabic. To cope with the difficulty of the documents, multiple text line segmentations were considered. An automatic procedure was used to prepare the annotated text lines needed to train the neural network. Language models were used to decode sequences of characters or words for French and English, and sequences of parts of Arabic words (PAWs) for Arabic. The system ranked first at the second Maurdor evaluation for both printed and handwritten text recognition in French, English and Arabic.
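
To illustrate the PAW units mentioned above, the following is a minimal sketch of how an Arabic word might be split into parts of Arabic words. It is not the authors' procedure: the function name `split_into_paws` and the `NON_CONNECTING` letter set are assumptions introduced here, based on the general rule that a PAW ends after a letter that does not join to the following letter.

```python
import unicodedata

# Assumed set of letters that do not connect to the following letter
# (alef variants, waw variants, dal, dhal, ra, zay, ta marbuta, hamza).
NON_CONNECTING = set("اآأإؤدذرزوةء")


def split_into_paws(word: str) -> list[str]:
    """Split one Arabic word into PAWs (hypothetical helper).

    A new PAW starts after any non-connecting letter. Combining marks
    (diacritics) stay attached to the letter they follow.
    """
    paws = []
    current = ""
    last_base = ""  # last non-diacritic character seen
    for ch in word:
        is_mark = unicodedata.category(ch) == "Mn"
        # Start a new PAW when the previous base letter cannot join forward.
        if current and not is_mark and last_base in NON_CONNECTING:
            paws.append(current)
            current = ""
        current += ch
        if not is_mark:
            last_base = ch
    if current:
        paws.append(current)
    return paws


if __name__ == "__main__":
    for w in ["كتاب", "مدرسة", "ورد"]:
        print(w, split_into_paws(w))
```

A language model over PAWs would then be estimated on such sub-word sequences rather than on whole words; how the submitted system tokenizes its training text is not specified in this abstract.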