Optical font recognition for multi-font OCR and document processing

In this paper we present a multi-font OCR system for document processing that simultaneously performs character recognition and font-style detection on digits belonging to a subset of the existing fonts. Detecting the font-style of the words in a document can guide a rough automatic classification of documents and can also be used to improve character recognition. The system uses the tangent distance as a classification function in a nearest neighbour approach. It must discriminate among different digits and, for the same character, among different font-styles. The nearest neighbour approach always recognizes the digit, but its performance in font detection is not optimal. To improve the performance of the system, we use a discriminant model, the TD-Neuron, to discriminate between two similar classes. We present experimental results and discuss prospective uses in document processing applications.
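
As an illustration of the nearest neighbour scheme mentioned above, the following is a minimal sketch of a one-sided tangent distance classifier. It rests on several assumptions that the abstract does not confirm: only horizontal and vertical translations are used as transformations, their tangent vectors are approximated by finite differences, the distance is one-sided (tangent plane built at the prototype only), and each prototype carries a hypothetical (digit, font) label. The TD-Neuron is not sketched here.

```python
import numpy as np

def tangent_vectors(image):
    """Approximate the tangents for x and y translation by finite differences.

    Assumption: only translations are modelled; other transformations
    (rotation, scaling, thickening) would add further columns.
    """
    dy, dx = np.gradient(image.astype(float))
    return np.stack([dx.ravel(), dy.ravel()], axis=1)  # shape (n_pixels, 2)

def one_sided_tangent_distance(x, prototype):
    """Squared distance from x to the tangent plane built at the prototype."""
    p = prototype.astype(float).ravel()
    T = tangent_vectors(prototype)
    diff = x.astype(float).ravel() - p
    # Least-squares coefficients of the tangent vectors that best explain diff.
    coeffs, *_ = np.linalg.lstsq(T, diff, rcond=None)
    residual = diff - T @ coeffs
    return float(residual @ residual)

def classify(x, prototypes):
    """Return the label of the nearest prototype.

    prototypes: list of (image, label) pairs, where label is a hypothetical
    (digit, font) tuple, so one lookup yields both digit and font-style.
    """
    distances = [one_sided_tangent_distance(x, img) for img, _ in prototypes]
    return prototypes[int(np.argmin(distances))][1]
```

Because a zero coefficient vector is always a feasible choice, this distance never exceeds the squared Euclidean distance between the two images; the reduction is what makes the measure tolerant to small shifts of the input digit.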