Semi-supervised framework for writer identification using structural learning

Writer identification is a complex task because an individual's handwriting encapsulates a great deal of information about both the text and the writer's personality. To learn a model that distinguishes one writer from another, it is important to capture every nuance of an individual's handwriting. Learning such a model poses two challenges. First, the discriminatory variables may be numerous and potentially interrelated, leading to a complex discriminatory function. Second, a large amount of training data is required to learn a complex and possibly high-dimensional function. In this study, the authors propose a semi-supervised framework for writer identification in offline handwritten documents that leverages the information hidden in unlabelled samples. The proposed framework manages the complexity of approximating the optimal hypothesis by breaking the main task into several subtasks and learning a separate hypothesis for each subtask. The hypotheses for all subtasks are then used for model selection by retrieving a common substructure that has high correspondence with every candidate hypothesis. The obtained substructure acts as a knowledge base holding contextual information that is otherwise difficult to retrieve. This extra information can be used to improve the performance of the identification model.
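The idea of learning one hypothesis per subtask and then extracting a substructure shared by all of them can be illustrated with a small sketch. This is not the authors' implementation: it assumes linear hypotheses fitted by ridge regression and takes the shared substructure to be the top singular directions of the stacked weight vectors. The function names (`fit_ridge`, `shared_substructure`, `augment`), the synthetic data, and the way subtask targets are generated are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_ridge(X, y, lam=1e-2):
    """Closed-form ridge regression weights for a single subtask (assumed learner)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def shared_substructure(X, subtask_targets, h, lam=1e-2):
    """Fit one linear hypothesis per subtask and return an (h, d) projection.

    The rows span the top-h left singular directions of the matrix whose
    columns are the subtask weight vectors, i.e. the directions that the
    candidate hypotheses have most in common.
    """
    W = np.column_stack([fit_ridge(X, y, lam) for y in subtask_targets])  # d x m
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    return U[:, :h].T


def augment(X, theta):
    """Append the shared low-dimensional features to the original features."""
    return np.hstack([X, X @ theta.T])


# Synthetic illustration: every subtask depends on the same hidden 2-D subspace.
rng = np.random.default_rng(0)
n, d, m, h = 200, 10, 6, 2
X = rng.normal(size=(n, d))      # stands in for features of unlabelled samples
B = rng.normal(size=(d, h))      # hidden structure shared by the subtasks
subtask_targets = [
    X @ (B @ rng.normal(size=h)) + 0.01 * rng.normal(size=n) for _ in range(m)
]

theta = shared_substructure(X, subtask_targets, h)
X_aug = augment(X, theta)        # features a final identification model could use
print(theta.shape, X_aug.shape)
```

In this sketch the augmented features `X_aug` play the role of the extra contextual information: a final writer identification model trained on the labelled samples would see both the original features and their projection onto the shared substructure.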
