Unconstrained handwriting recognition applied to the processing of bank cheques

A method is proposed for recognizing unconstrained handwritten words belonging to a small static lexicon. Previous approaches typically attempt to recognize characters, or parts of characters, in order to recognize words. Our approach, in its first step, bypasses the notion of characters. In addition to being language independent, our method is more context oriented and should prove more robust against poor handwriting, spelling mistakes, noise and the like. Our computational theory is based on a psychological model of the reading process of a fast reader. First, a few graphical clues, such as ascenders, descenders and their relative positions, are extracted from the word. If these are not sufficient to identify the word clearly, then details (secondary features, including the first and last characters of the word) are extracted to enhance the recognition. We designed and collected a database of bank cheques in both English and French. The result is a one-of-a-kind database in a university setting for handwritten information from bank cheques, in terms of both its size and the number of different writers involved. We further designed an innovative, simple yet powerful in-place tagging procedure for our database. It enables us to extract at will not only the bitmaps of words, characters, digits, lines, commas, etc., but also all kinds of contextual information. We developed a fully trainable word recognizer with the requirement that switching to a different database and/or language should require neither a redesign nor extensive retraining. The number of parameters in the system has been kept to a minimum, and wherever possible we designed algorithms that have no parameters and therefore need no training. One example is our slant correction algorithm, which is notable for its simplicity and robustness.
Whenever parameters need to be adjusted to a specific database, the adjustment is done automatically by genetic algorithms. We tested the generality and adaptability of our system on two different databases of bank cheques, one English and one French. The system's parameters did not need to be readjusted for it to perform satisfactorily when we switched from one database to the other. At the time of this dissertation, our survey indicates that this research is the only work in the literature that can handle English cheques, and our results are comparable to those published on the processing of French cheques. Our preliminary results on the French database show recognition rates of 98.9% for words and 94.3% for full legal amounts, with the correct answer among the top 5 choices.
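
To make the first recognition step more concrete, the following is a minimal sketch of how ascenders, descenders and their relative positions might be read off a binarized word image. It is not the algorithm of this dissertation: the function names `body_zone` and `coarse_features`, the `frac` threshold, the use of a horizontal ink projection to locate the main body of the word, and the toy image `WORD` are all assumptions made for illustration.

```python
from itertools import groupby
from typing import List, Tuple

# Toy binarized word image (hypothetical data): '#' is ink, '.' is background.
WORD = [
    ".#.......",
    ".#.......",
    ".###.###.",
    ".#.#.#.#.",
    ".###.###.",
    ".......#.",
    ".......#.",
]

def body_zone(image: List[str], frac: float = 0.5) -> Tuple[int, int]:
    """Estimate the main body of the word as the rows whose ink count
    reaches `frac` of the densest row (an assumed heuristic)."""
    counts = [row.count("#") for row in image]
    peak = max(counts)
    if peak == 0:
        raise ValueError("image contains no ink")
    dense = [i for i, c in enumerate(counts) if c >= frac * peak]
    return dense[0], dense[-1]

def coarse_features(image: List[str], frac: float = 0.5) -> List[Tuple[str, float]]:
    """Return (kind, position) pairs, where kind is 'A' (ascender),
    'D' (descender) or 'B' (both), and position is the horizontal
    centre of the stroke scaled to the range 0..1."""
    top, bottom = body_zone(image, frac)
    width = len(image[0])
    labels = []
    for x in range(width):
        column = [row[x] for row in image]
        above = "#" in column[:top]          # ink above the body zone
        below = "#" in column[bottom + 1:]   # ink below the body zone
        labels.append("B" if above and below else
                      "A" if above else
                      "D" if below else "-")
    features = []
    x = 0
    # Merge adjacent columns with the same label into a single stroke.
    for label, run in groupby(labels):
        n = len(list(run))
        if label != "-":
            centre = x + (n - 1) / 2
            features.append((label, centre / max(width - 1, 1)))
        x += n
    return features

if __name__ == "__main__":
    print(coarse_features(WORD))
```

In a scheme of this kind, the resulting sequence of coarse features would be compared with the entries of the small lexicon, and the secondary features would be consulted only when several entries remain plausible.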