Computer Assisted Transcription of Text Images: Results on the GERMANA Corpus and Analysis of Improvements Needed for Practical Use

We present a study of the application of Computer Assisted Transcription of Text Images (CATTI) to a task that is much closer to real applications than the tasks previously studied. The new task consists of transcribing GERMANA, a new, publicly available historical handwritten document. A detailed analysis of the main factors influencing system performance is presented, and some strategies to circumvent them are proposed.
