Postprocessing of Recognized Strings Using

This paper presents Nonstationary Markovian Models and their application to the recognition of strings of tokens. Domain-specific knowledge, in the form of postal directory files, is brought to bear on the task of recognizing ZIP Codes in the U.S. mailstream. These files provide a wealth of information on the delivery points (mailstops) corresponding to each ZIP Code. This data feeds into the models as n-gram statistics that are integrated with the recognition scores of digit images. An especially interesting facet of the model is its ability to excite and inhibit certain positions in the n-grams, which leads to the familiar area of Markov Random Fields. The authors have described elsewhere (2) a methodology for deriving probability values from recognizer scores. These probability measures allow the Markov chain to be constructed in a truly Bayesian framework. We empirically illustrate the success of Markovian modeling in postprocessing applications of string recognition, and we report the recognition accuracy of the different models on a set of 20,000 ZIP Codes. The performance is superior to that of the present system, which ignores all contextual information and relies solely on the recognition scores of the digit recognizers.

Index Terms: Nonstationary hidden Markov models, zip code recognition, postprocessing, class conditional probability, Markov random fields.
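To make the idea of combining directory statistics with digit recognition scores concrete, the following is a minimal sketch, not the authors' implementation. It assumes a first-order (bigram) chain whose transition table depends on the digit position, add-alpha smoothing, per-digit recognizer outputs that are already probabilities, and Viterbi decoding to pick the best string. The function names, the toy directory, and the probability values are all hypothetical and chosen only for illustration.

```python
import math

def train_nonstationary_bigrams(zip_codes, alpha=1.0):
    """Estimate position-dependent start and transition probabilities
    from a list of equal-length digit strings (add-alpha smoothing)."""
    length = len(zip_codes[0])
    init = [alpha] * 10
    # trans[t][a][b]: count of digit a at position t followed by b at t + 1
    trans = [[[alpha] * 10 for _ in range(10)] for _ in range(length - 1)]
    for code in zip_codes:
        digits = [int(c) for c in code]
        init[digits[0]] += 1
        for t in range(length - 1):
            trans[t][digits[t]][digits[t + 1]] += 1
    total = sum(init)
    init = [c / total for c in init]
    for t in range(length - 1):
        for a in range(10):
            row_total = sum(trans[t][a])
            trans[t][a] = [c / row_total for c in trans[t][a]]
    return init, trans

def viterbi_decode(obs_probs, init, trans, floor=1e-12):
    """Return the most probable digit string given obs_probs[t][d],
    the (assumed) probability of the image at position t for digit d."""
    def log(p):
        return math.log(max(p, floor))  # guard against log(0)

    score = [log(init[d]) + log(obs_probs[0][d]) for d in range(10)]
    back = []
    for t in range(1, len(obs_probs)):
        new_score, pointers = [], []
        for d in range(10):
            prev = max(range(10),
                       key=lambda a: score[a] + log(trans[t - 1][a][d]))
            new_score.append(score[prev] + log(trans[t - 1][prev][d])
                             + log(obs_probs[t][d]))
            pointers.append(prev)
        score = new_score
        back.append(pointers)
    path = [max(range(10), key=lambda d: score[d])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return "".join(str(d) for d in reversed(path))

def peaked(digit, p=0.91):
    """Hypothetical recognizer output that is confident about one digit."""
    return [p if d == digit else (1 - p) / 9 for d in range(10)]

# Toy directory (hypothetical) and a string whose fourth digit is ambiguous:
# the recognizer slightly prefers 0 over 6 at that position.
directory = ["14260", "14261", "14226"]
ambiguous = [0.0125] * 10
ambiguous[0], ambiguous[6] = 0.5, 0.4
obs = [peaked(1), peaked(4), peaked(2), ambiguous, peaked(0)]

init, trans = train_nonstationary_bigrams(directory)
context_free = "".join(str(max(range(10), key=lambda d: p[d])) for p in obs)
decoded = viterbi_decode(obs, init, trans)
print(context_free, decoded)
```

In this toy setting the context-free reading takes the top-scoring digit at each position, while the decoder lets the directory statistics override the weak preference at the ambiguous position. Keeping a separate transition table per position is what makes the chain nonstationary here; a stationary model would share one table across all positions.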

[1] Fumitaka Kimura, et al. Handwritten numerical recognition based on multiple algorithms, 1991, Pattern Recognit.

[2] Gyeonghwan Kim, et al. A Lexicon Driven Approach to Handwritten Word Recognition for Real-Time Applications, 1997, IEEE Trans. Pattern Anal. Mach. Intell.

[3] Yves Lecourtier, et al. Optimal Order of Markov Models Applied to Bankchecks, 1997, Int. J. Pattern Recognit. Artif. Intell.

[4] Godfried T. Toussaint, et al. Experiments in Text Recognition with the Modified Viterbi Algorithm, 1979, IEEE Trans. Pattern Anal. Mach. Intell.

[5] T. W. Anderson, et al. Statistical Inference about Markov Chains, 1957.

[6] Djamel Bouchaffra, et al. Incorporating diverse information sources in handwriting recognition postprocessing, 1996.

[7] Robert L. Mercer, et al. Class-Based n-gram Models of Natural Language, 1992, CL.

[8] Venu Govindaraju, et al. Segmentation and recognition of connected handwritten numeral strings, 1997, Pattern Recognit.

[9] Lawrence R. Rabiner, et al. A tutorial on hidden Markov models and selected applications in speech recognition, 1989, Proc. IEEE.

[10] Ching Y. Suen, et al. Structural classification and relaxation matching of totally unconstrained handwritten zip-code numbers, 1988, Pattern Recognit.

[11] Malayappan Shridhar, et al. Context-directed segmentation algorithm for handwritten numeral strings, 1987, Image Vis. Comput.

[12] Geetha Srikantan, et al. A multiple feature/resolution approach to handprinted digit and character recognition, 1996.

[13] Jonathan J. Hull. Incorporating Language Syntax in Visual Text Recognition with a Statistical Model, 1996, IEEE Trans. Pattern Anal. Mach. Intell.

[14] Rajjan Shinghal, et al. A hybrid classifier for recognizing handwritten numerals, 1997, Proceedings of the Fourth International Conference on Document Analysis and Recognition.

[15] Jian Zhou, et al. Off-Line Handwritten Word Recognition Using a Hidden Markov Model Type Stochastic Network, 1994, IEEE Trans. Pattern Anal. Mach. Intell.

[16] Ching Y. Suen, et al. Computer recognition of unconstrained handwritten numerals, 1992, Proc. IEEE.