Semi-supervised feature learning for improving writer identification

Data augmentation is typically used by supervised feature learning approaches for offline writer identification, but such approaches require a large amount of additional training data and can lead to overfitting. In this study, a semi-supervised feature learning pipeline is proposed to improve the performance of writer identification by training on extra unlabeled data and the original labeled data simultaneously. Specifically, we propose a weighted label smoothing regularization (WLSR) method for data augmentation, which assigns a weighted uniform label distribution to the extra unlabeled data. The WLSR method regularizes the convolutional neural network (CNN) baseline so that it learns more discriminative features for representing the properties of different writing styles. Experimental results on two well-known benchmark datasets (ICDAR2013 and CVL) show that the proposed semi-supervised feature learning approach significantly improves on the baseline and performs competitively with existing writer identification approaches. Our findings provide new insights into offline writer identification.
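As a rough illustration of the idea of giving unlabeled samples a weighted uniform label distribution, the following minimal sketch computes a per-sample loss. It is not the exact WLSR formulation, which the abstract does not specify: the function names, the scalar `weight`, and the way the weight scales the uniform-target term are all assumptions made for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of class scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def smoothed_loss(logits, label=None, weight=0.5):
    """Cross-entropy for one sample (hypothetical helper, not the paper's code).

    label:  writer index for a labeled sample, or None for an unlabeled one.
    weight: assumed scalar that scales the unlabeled term.
    """
    log_p = log_softmax(logits)
    k = len(logits)
    if label is not None:
        # Labeled sample: ordinary cross-entropy against a one-hot target.
        return -log_p[label]
    # Unlabeled sample: target is uniform (1/K for each of the K writers),
    # and the resulting cross-entropy is scaled by the assumed weight.
    return -weight * sum(log_p) / k
```

Under these assumptions, an unlabeled sample is penalized when the network is overconfident about any single writer, which is the regularizing effect the abstract attributes to the method. For example, `smoothed_loss([10.0, 0.0, 0.0], weight=1.0)` is larger than `smoothed_loss([0.0, 0.0, 0.0], weight=1.0)`.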
