Combining Statistical and Syntactic Methods in Recognizing Handwritten Sentences

The output of handwritten word recognizers tends to be very noisy owing to factors such as variable handwriting styles and distortions in the image data. To compensate for this behaviour, several candidate choices from the word recognizer are initially considered and then reduced to a single choice using constraints imposed by the particular domain. In handwritten sentence/phrase recognition, linguistic constraints may be applied to improve the results of the word recognizer, either (i) as a purely post-processing operation or (ii) in a feedback loop to the word recognizer. This paper discusses two statistical methods of applying syntactic constraints to the output of a handwritten word recognizer on input consisting of sentences/phrases. Both methods are based on syntactic categories (tags) associated with words. The first is a purely statistical method; the second is a hybrid method that combines higher-level syntactic information (hypertags) with statistical information regarding transitions between hypertags. We show the utility of both approaches in the problem of handwritten sentence/phrase recognition.
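As an illustration only, the idea of reducing several word candidates per position to a single sentence using tag statistics can be sketched as a Viterbi-style search over the candidate lattice. This is a minimal sketch under stated assumptions, not the method of the paper: the function name `best_sentence`, the tag labels, the bigram transition table, the confidence values, and the smoothing floor are all hypothetical, and hypertags are not modelled.

```python
import math

FLOOR = 1e-6  # assumed probability for tag transitions missing from the table


def best_sentence(candidates, transition, start_tag="<s>"):
    """Pick one word per position by combining recognizer confidence
    with tag-to-tag transition probabilities (hypothetical bigram model).

    candidates: one non-empty list per word position, each holding
                (word, tag, confidence) tuples with confidence in (0, 1].
    transition: dict mapping (previous_tag, tag) to a probability.
    """
    # Each path is (log score, tag of its last word, words chosen so far).
    paths = [(0.0, start_tag, [])]
    for position in candidates:
        new_paths = []
        for word, tag, confidence in position:
            # Best predecessor path for this candidate (the Viterbi step).
            score, _, words = max(
                (
                    (s + math.log(transition.get((t, tag), FLOOR)), t, w)
                    for s, t, w in paths
                ),
                key=lambda p: p[0],
            )
            new_paths.append((score + math.log(confidence), tag, words + [word]))
        paths = new_paths
    return max(paths, key=lambda p: p[0])[2]


# Invented example data: two candidates per position.
candidates = [
    [("he", "PRON", 0.6), ("be", "VERB", 0.4)],
    [("will", "MODAL", 0.5), ("wilt", "VERB", 0.5)],
    [("call", "VERB", 0.45), ("cell", "NOUN", 0.55)],
]
transition = {
    ("<s>", "PRON"): 0.5, ("<s>", "VERB"): 0.1,
    ("PRON", "MODAL"): 0.4, ("PRON", "VERB"): 0.4,
    ("VERB", "MODAL"): 0.05, ("VERB", "VERB"): 0.1,
    ("MODAL", "VERB"): 0.7, ("MODAL", "NOUN"): 0.02,
    ("VERB", "NOUN"): 0.3,
}

print(best_sentence(candidates, transition))  # ['he', 'will', 'call']
```

In this invented example the recognizer alone would prefer "cell" in the last position (0.55 against 0.45), but the tag transitions favour a verb after a modal, so the search selects "call" instead.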