Pyramid Embedded Generative Adversarial Network for Automated Font Generation

In this paper, we investigate the Chinese font synthesis problem and propose a Pyramid Embedded Generative Adversarial Network (PEGAN) to automatically generate Chinese character images. The PEGAN consists of one generator and one discriminator. The generator is an encoder-decoder structure with cascaded refinement connections and mirror skip connections. The cascaded refinement connections embed a multiscale pyramid of down-sampled copies of the original input into the encoder feature maps at different layers, while the mirror skip connections link the multiscale feature maps of the encoder to the corresponding feature maps in the decoder. The generator and discriminator are trained alternately with a combination of generative adversarial loss, pixel-wise loss, category loss and perceptual loss to synthesize character images. To verify the effectiveness of the proposed PEGAN, we first build an evaluation set in which the characters are selected according to their stroke number and frequency of use, and then use both qualitative and quantitative metrics to compare the performance of our model with that of the baseline method. The experimental results demonstrate the effectiveness of the proposed model and show its potential to automatically extend small font banks into complete ones.
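The pyramid embedding described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the 2x2 average pooling, the use of channel concatenation as the "embedding" step, and all names (`downsample2x`, `build_pyramid`, `embed`) are assumptions made for illustration, and the encoder feature maps are zero-filled placeholders rather than real convolution outputs.

```python
from typing import List

Image = List[List[float]]  # a single channel, H x W


def downsample2x(img: Image) -> Image:
    """Halve height and width with 2x2 average pooling (assumed operator)."""
    h, w = len(img), len(img[0])
    return [
        [
            (img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
            for c in range(0, w - 1, 2)
        ]
        for r in range(0, h - 1, 2)
    ]


def build_pyramid(img: Image, levels: int) -> List[Image]:
    """Return successively down-sampled copies: half size, quarter size, ..."""
    pyramid = []
    current = img
    for _ in range(levels):
        current = downsample2x(current)
        pyramid.append(current)
    return pyramid


def embed(features: List[Image], pyramid_level: Image) -> List[Image]:
    """Append a pyramid level to encoder feature maps as one extra channel.

    Concatenation is an assumption; the abstract only says "embed".
    """
    same_height = len(features[0]) == len(pyramid_level)
    same_width = len(features[0][0]) == len(pyramid_level[0])
    if not (same_height and same_width):
        raise ValueError("spatial sizes must match")
    return features + [pyramid_level]


# Toy 8x8 checkerboard standing in for a character image.
image = [[float((r + c) % 2) for c in range(8)] for r in range(8)]
pyramid = build_pyramid(image, levels=2)  # a 4x4 level and a 2x2 level

# Placeholder encoder outputs: 3 channels at 4x4, then 5 channels at 2x2.
encoder_maps = [
    [[[0.0] * 4 for _ in range(4)] for _ in range(3)],
    [[[0.0] * 2 for _ in range(2)] for _ in range(5)],
]

# Each encoder layer receives the pyramid level of matching resolution.
embedded = [embed(f, p) for f, p in zip(encoder_maps, pyramid)]
channel_counts = [len(e) for e in embedded]
```

In this sketch each encoder layer gains one channel carrying the input image at that layer's resolution, which is the sense in which the original input stays visible to deeper layers.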
