Interactive Off-Line Handwritten Text Transcription Using On-Line Handwritten Text as Feedback

Handwritten Text Recognition has gained attention in recent years, mainly because of the interest in transcribing historical documents. However, automatic transcription is not accurate enough for unconstrained handwritten documents, so human intervention is typically needed to correct the results. Because post-editing is inefficient and uncomfortable, multimodal interactive approaches have begun to emerge. In this scheme, the user interacts with the system by means of an e-pen. This multimodal feedback improves the accuracy of the system and also increases user acceptability. In this work, we present a new approach to interaction based on character sequences, together with developments that exploit interaction-derived context to significantly improve feedback decoding accuracy. Empirical tests suggest that, although e-pen feedback lacks the deterministic accuracy of traditional peripherals, this approach can save significant amounts of user effort with respect to non-interactive post-editing correction.
