A contextual processing for an OCR system, based on pattern learning

The authors propose a contextual stage, based on automatic pattern training, that can be added to any OCR system for multifont printed documents. The goal is to improve the final performance of the recognizer by exploiting the redundancy of character patterns in the text. Instead of performing a standard word-by-word analysis, the proposed stage corrects a character pattern by examining all the words in the text that contain the character prototype. The algorithm computes the substitution probabilities for each character prototype and stores all the possible character substitutions. The corrections retained are those that maximize the number of words found in the dictionary. The algorithm has also been modified to correct merged characters.
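The prototype-level correction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data layout (words as sequences of prototype ids), the names `decode` and `correct_prototypes`, and the greedy one-prototype-at-a-time search are all assumptions made for illustration. The substitution probabilities mentioned in the abstract are represented only by the order of each candidate list (most probable first), which is used to break ties.

```python
from typing import Dict, List, Sequence, Set

Word = Sequence[int]  # hypothetical layout: a word is a sequence of prototype ids


def decode(word: Word, labels: Dict[int, str]) -> str:
    """Spell a word using the character currently assigned to each prototype."""
    return "".join(labels[p] for p in word)


def correct_prototypes(
    words: List[Word],
    labels: Dict[int, str],
    candidates: Dict[int, List[str]],
    dictionary: Set[str],
) -> Dict[int, str]:
    """Relabel each prototype with the candidate that yields the most dictionary words.

    `candidates[p]` lists possible substitutions for prototype p, assumed to be
    sorted by decreasing substitution probability. The current label is kept
    unless a candidate produces strictly more dictionary words.
    """
    labels = dict(labels)  # work on a copy, leave the input untouched
    for proto, options in candidates.items():
        # Only words containing this prototype are affected by relabelling it.
        affected = [w for w in words if proto in w]

        def score(ch: str) -> int:
            trial = {**labels, proto: ch}
            return sum(decode(w, trial) in dictionary for w in affected)

        best, best_score = labels[proto], score(labels[proto])
        for ch in options:
            s = score(ch)
            if s > best_score:  # strict: earlier (more probable) candidates win ties
                best, best_score = ch, s
        labels[proto] = best
    return labels


# Toy example: prototype 4 was recognized as "c" but should be "e".
words = [[2, 3, 4], [3, 1, 2], [2, 4, 1], [0, 1, 2]]  # the, hat, tea, cat
labels = {0: "c", 1: "a", 2: "t", 3: "h", 4: "c"}
candidates = {4: ["e", "o"]}
dictionary = {"the", "hat", "tea", "cat"}

fixed = correct_prototypes(words, labels, candidates, dictionary)
print([decode(w, fixed) for w in words])
```

Because every occurrence of prototype 4 is relabelled at once, both "thc" and "tca" are repaired by a single decision, which is the advantage of working on prototypes rather than word by word. Merged characters are not handled in this sketch.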
