Interactive layout analysis and transcription systems for historic handwritten documents

The number of digitized legacy documents has risen dramatically in recent years, mainly because a growing number of online digital libraries publish such documents, which still have to be classified and then transcribed into an electronic text format (such as ASCII or PDF). However, most fully automatic applications that address this task are far from perfect, and heavy, inefficient human intervention is often required to check and correct their output. In contrast, multimodal interactive-predictive approaches allow users to take part in the process, helping the system improve its overall performance. With this in mind, this work introduces two sets of recent advances: a novel interactive method for text block detection, and two multimodal interactive handwritten text transcription systems that use active learning and interactive-predictive technologies in the recognition process.
