Neural network-based proper names extraction in fax images

This paper addresses the extraction of the sender's name from fax cover pages using a machine learning scheme. Two analysis methods are implemented to run in parallel: the first is based on document image analysis (OCR, physical block selection), and the second on text analysis (word feature extraction, local grammar rules). Our main contribution is the introduction of a neural network to find an optimal combination of the two approaches. Tests carried out on real fax images show that the neural network improves performance over both an empirical combination function and each method used separately.
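To make the combination idea concrete, the following is a minimal sketch and not the system described here. It assumes each candidate word receives one score from the image analysis and one from the text analysis, both in [0, 1]. The function names, the single logistic neuron, the weighted-average baseline, and the toy scores are all hypothetical choices made for illustration; the abstract does not specify the network architecture, the input features, or the form of the empirical function.

```python
import math

def empirical_combination(image_score, text_score, weight=0.5):
    """Hand-tuned weighted average of the two scores (hypothetical baseline)."""
    return weight * image_score + (1.0 - weight) * text_score

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_combiner(samples, labels, epochs=5000, lr=0.5):
    """Fit a single logistic neuron on (image_score, text_score) pairs
    with plain stochastic gradient descent."""
    w_img, w_txt, bias = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for (img, txt), y in zip(samples, labels):
            p = sigmoid(w_img * img + w_txt * txt + bias)
            err = p - y  # cross-entropy gradient w.r.t. the pre-activation
            w_img -= lr * err * img
            w_txt -= lr * err * txt
            bias -= lr * err
    return w_img, w_txt, bias

def neural_combination(params, image_score, text_score):
    """Learned combination: probability that the candidate is the sender's name."""
    w_img, w_txt, bias = params
    return sigmoid(w_img * image_score + w_txt * text_score + bias)

# Toy, made-up scores for six candidate words (illustration only, not real data).
# Label 1 means "sender's name", label 0 means "not the sender's name".
SAMPLES = [(0.9, 0.8), (0.7, 0.9), (0.8, 0.3), (0.2, 0.9), (0.1, 0.2), (0.3, 0.1)]
LABELS = [1, 1, 0, 0, 0, 0]

PARAMS = train_combiner(SAMPLES, LABELS)

if __name__ == "__main__":
    for (img, txt), y in zip(SAMPLES, LABELS):
        print(img, txt, y,
              round(empirical_combination(img, txt), 2),
              round(neural_combination(PARAMS, img, txt), 2))
```

In this toy setting, the fixed average gives 0.55 to the candidate scored (0.8, 0.3), which would pass a 0.5 threshold even though it is labeled as not a name. A trained combiner can instead place the decision boundary where the labeled examples require it, which is the motivation for learning the combination rather than fixing it by hand.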
