Multimodal Interactive Transcription of Ancient Text Images

The number of digitized legacy documents has risen dramatically in recent years, mainly because a growing number of online digital libraries publish this kind of document. On the one hand, the vast majority of these documents still await transcription into a textual electronic format (such as ASCII or PDF) that would give historians and other researchers new ways of indexing, consulting and querying them. On the other hand, in some cases adequate transcriptions of the handwritten text images are already available. This drives an increasing need to align images and transcriptions so that consulting these documents becomes more convenient. In this work, two systems are presented to deal with these issues. The first aims at transcribing these documents using an interactive-predictive approach, which integrates the user's corrective feedback into the recognition process itself. The second is an alignment method based on the Viterbi algorithm that maps the word images of a given handwritten document to the corresponding (ASCII) words in its transcription.
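To make the alignment idea more concrete, the following is a minimal sketch of a Viterbi-style forced alignment, not the system described in this work. It assumes that a text line image has already been cut into a left-to-right sequence of frames and that some model supplies a log-score for each frame under each word of the known transcription; the functions `viterbi_align` and `segments` and the toy score matrix are hypothetical names and values chosen for illustration.

```python
import math


def viterbi_align(scores):
    """Align frames to transcription words with a left-to-right Viterbi pass.

    scores[t][w] is an assumed log-score of frame t under word w.
    The path starts at word 0, ends at the last word, and at each frame
    either stays on the current word or advances to the next one.
    Returns the word index assigned to each frame.
    """
    n_frames = len(scores)
    n_words = len(scores[0])
    if n_frames < n_words:
        raise ValueError("need at least one frame per word")

    neg_inf = -math.inf
    best = [[neg_inf] * n_words for _ in range(n_frames)]
    back = [[0] * n_words for _ in range(n_frames)]
    best[0][0] = scores[0][0]

    for t in range(1, n_frames):
        for w in range(n_words):
            stay = best[t - 1][w]
            advance = best[t - 1][w - 1] if w > 0 else neg_inf
            if advance > stay:
                best[t][w] = advance + scores[t][w]
                back[t][w] = w - 1
            else:
                best[t][w] = stay + scores[t][w]
                back[t][w] = w

    # Backtrace from the last word at the last frame.
    path = [n_words - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


def segments(path):
    """Turn a per-frame word path into (word, first_frame, last_frame) spans."""
    spans = []
    start = 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[t - 1]:
            spans.append((path[start], start, t - 1))
            start = t
    return spans


# Toy example: 6 frames, 3 transcription words (scores are made up).
toy_scores = [
    [0, -5, -5],
    [0, -5, -5],
    [-5, 0, -5],
    [-5, 0, -5],
    [-5, 0, -5],
    [-5, -5, 0],
]
toy_path = viterbi_align(toy_scores)
toy_spans = segments(toy_path)
```

In this sketch the word boundaries fall out of the best path as a by-product: each span gives the range of frames, and hence the image region, mapped to one word of the transcription. A real system would obtain the scores from trained handwriting models rather than from a hand-written table.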
