Adversarial Generation of Handwritten Text Images Conditioned on Sequences

We propose a system based on Generative Adversarial Networks (GAN) to produce synthetic images of handwritten words. We use bidirectional LSTM recurrent layers to get an embedding of the word to be rendered, and we feed it to the generator network. We also modify the standard GAN by adding an auxiliary network for text recognition. The system is then trained with a balanced combination of an adversarial loss and a CTC loss. Together, these extensions to GAN enable to control the textual content of the generated word images yielding realistic-looking images on both French and Arabic languages. State-of-the-art offline handwriting text recognition systems tend to use neural networks and therefore require a large amount of annotated data to be trained. In order to partially satisfy this requirement, we could use those synthetic images to increase the amount of training data. We show that integrating generated images into the existing training data of a text recognition system can slightly enhance its performance.

[1]  Yuichi Yoshida,et al.  Spectral Normalization for Generative Adversarial Networks , 2018, ICLR.

[2]  Andrew L. Maas Rectifier Nonlinearities Improve Neural Network Acoustic Models , 2013 .

[3]  Geoffrey E. Hinton,et al.  Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.

[4]  Qiong Zhang,et al.  Generating Handwritten Chinese Characters Using CycleGAN , 2018, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[5]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[6]  Hugo Larochelle,et al.  Modulating early visual processing by language , 2017, NIPS.

[7]  Bernt Schiele,et al.  Generative Adversarial Text to Image Synthesis , 2016, ICML.

[8]  Théodore Bluche,et al.  Gated Convolutional Recurrent Neural Networks for Multilingual Handwriting Recognition , 2017, 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR).

[9]  Han Zhang,et al.  Self-Attention Generative Adversarial Networks , 2018, ICML.

[10]  Alejandro Héctor Toselli,et al.  ICDAR2017 Competition on Handwritten Text Recognition on the READ Dataset , 2017, 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR).

[11]  Jae Hyun Lim,et al.  Geometric GAN , 2017, ArXiv.

[12]  Jürgen Schmidhuber,et al.  Learning to Forget: Continual Prediction with LSTM , 2000, Neural Computation.

[13]  Jonathon Shlens,et al.  Conditional Image Synthesis with Auxiliary Classifier GANs , 2016, ICML.

[14]  Alex Graves,et al.  Generating Sequences With Recurrent Neural Networks , 2013, ArXiv.

[15]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[16]  Alexander Mordvintsev,et al.  Inceptionism: Going Deeper into Neural Networks , 2015 .

[17]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[18]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[19]  Jürgen Schmidhuber,et al.  Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.

[20]  Jeff Donahue,et al.  Large Scale GAN Training for High Fidelity Natural Image Synthesis , 2018, ICLR.

[21]  Simon Osindero,et al.  Conditional Generative Adversarial Nets , 2014, ArXiv.

[22]  Valentin Khrulkov,et al.  Geometry Score: A Method For Comparing Generative Adversarial Networks , 2018, ICML.

[23]  Alexei A. Efros,et al.  Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[24]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[25]  Haikal El Abed,et al.  ICDAR 2009 Handwriting Recognition Competition , 2009, 2009 10th International Conference on Document Analysis and Recognition.

[26]  Raymond Y. K. Lau,et al.  Least Squares Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).