Anatomy of a form reader

Forms are used extensively in today's offices. The task of an automated form reader is to locate the data filled in on a form and to encode its content into appropriate symbolic descriptions. The challenges in form reading stem from the high volume of forms and their large variety. A robust form reader with high adaptability and trainability is presented. The form reader consists of two modules: a field registration module and a data recognition module. The field registration module acquires knowledge about the forms of interest, and the data recognition module uses that knowledge to recognize text data on filled forms. The capability of the reader increases progressively through supervised learning. The form reader has been trained to read a large variety of forms with machine-printed data. The adaptability and trainability of the system have been demonstrated through experiments.
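
The two-module organization described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `FormReader`, `FormTemplate`, `register`, `read`, and `toy_recognizer` are hypothetical, a form image is modeled as a list of text rows rather than pixels, and registration simply stores field coordinates supplied by hand, standing in for whatever supervised procedure the actual system uses.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Assumption: a field location is (top, left, bottom, right), half-open.
Box = Tuple[int, int, int, int]
# Assumption: a toy "image" whose rows of characters stand in for pixels.
Image = List[str]


@dataclass
class FormTemplate:
    """Knowledge acquired about one form type: where each field lies."""
    form_id: str
    fields: Dict[str, Box]


@dataclass
class FormReader:
    recognize: Callable[[Image], str]  # pluggable text recognizer
    templates: Dict[str, FormTemplate] = field(default_factory=dict)

    def register(self, form_id: str, fields: Dict[str, Box]) -> None:
        """Field registration: store the field layout of a form of interest."""
        self.templates[form_id] = FormTemplate(form_id, dict(fields))

    def read(self, form_id: str, image: Image) -> Dict[str, str]:
        """Data recognition: crop each registered field and recognize it."""
        template = self.templates[form_id]
        result: Dict[str, str] = {}
        for name, (top, left, bottom, right) in template.fields.items():
            crop = [row[left:right] for row in image[top:bottom]]
            result[name] = self.recognize(crop)
        return result


def toy_recognizer(crop: Image) -> str:
    # Placeholder for a real OCR engine: join non-empty rows, trim padding.
    return " ".join(row.strip() for row in crop if row.strip())


if __name__ == "__main__":
    reader = FormReader(recognize=toy_recognizer)
    reader.register("invoice", {"name": (0, 6, 1, 20), "total": (1, 7, 2, 20)})
    page = ["Name: Ada Lovelace   ", "Total: 42.00         "]
    print(reader.read("invoice", page))
```

In this sketch, each call to `register` widens the set of forms the reader can handle without changing the recognition code, which is one simple way to picture a reader whose capability grows as it is taught new forms.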