Computer Assisted Transcription for Ancient Text Images

Paleography experts spend many hours transcribing ancient documents, and state-of-the-art handwritten text recognition systems are not suitable for performing this task automatically. We propose a new interactive, online framework that, rather than pursuing full automation, aims to assist the experts in the recognition-transcription process itself; that is, to facilitate and speed up the transcription of old documents. The framework combines the efficiency of automatic handwriting recognition systems with the accuracy of the experts, leading to cost-effective, perfect transcription of ancient manuscripts.
