Enhancing degraded document images via bitmap clustering and averaging

Proper display and accurate recognition of document images are often hampered by degradations caused by poor scanning or transmission conditions. The authors propose a method to enhance such degraded document images for better display quality and recognition accuracy. The method finds bitmaps of the same symbol that are scattered across a text page and averages them. Outline descriptions of the symbols are then derived from the averaged bitmaps, and these outlines can be rendered at arbitrary resolution. The paper describes the details of the algorithm and an experiment that demonstrates its capabilities on fax images.
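The clustering-and-averaging idea can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the symbols have already been segmented and aligned into equal-sized binary bitmaps, and it uses Hamming distance with greedy seed-based clustering as stand-ins for whatever matching criterion the paper uses. The names `hamming`, `cluster_bitmaps`, `average_bitmap`, `threshold`, and the parameter `max_dist` are hypothetical. The outline-fitting step that allows rendering at arbitrary resolution is not covered.

```python
from typing import List

Bitmap = List[List[int]]  # rows of 0/1 pixels; all bitmaps assumed the same size


def hamming(a: Bitmap, b: Bitmap) -> int:
    """Count the pixels at which two equal-sized bitmaps differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def cluster_bitmaps(bitmaps: List[Bitmap], max_dist: int) -> List[List[Bitmap]]:
    """Greedy clustering (an illustrative assumption): each bitmap joins the
    first cluster whose seed (first member) is within max_dist of it."""
    clusters: List[List[Bitmap]] = []
    for bm in bitmaps:
        for cluster in clusters:
            if hamming(cluster[0], bm) <= max_dist:
                cluster.append(bm)
                break
        else:
            clusters.append([bm])
    return clusters


def average_bitmap(cluster: List[Bitmap]) -> List[List[float]]:
    """Pixel-wise mean of the cluster members, giving a gray-level image."""
    n = len(cluster)
    height, width = len(cluster[0]), len(cluster[0][0])
    return [[sum(bm[y][x] for bm in cluster) / n for x in range(width)]
            for y in range(height)]


def threshold(gray: List[List[float]], level: float = 0.5) -> Bitmap:
    """Binarize the averaged image; isolated noise pixels are voted out."""
    return [[1 if v > level else 0 for v in row] for row in gray]


# Toy data: three noisy copies of a 3x3 "plus" sign and one unrelated symbol.
clean = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
noisy = [
    [[1, 1, 0], [1, 1, 1], [0, 1, 0]],  # extra pixel at top-left
    [[0, 1, 0], [1, 1, 1], [0, 0, 0]],  # missing pixel at bottom-center
    [[0, 1, 0], [1, 1, 0], [0, 1, 0]],  # missing pixel at middle-right
]
other = [[1, 0, 1], [0, 0, 0], [1, 0, 1]]

clusters = cluster_bitmaps(noisy + [other], max_dist=2)
restored = threshold(average_bitmap(clusters[0]))
```

In this toy case each noise pixel appears in only one of the three copies, so averaging followed by thresholding recovers the clean shape. On a real page the benefit would depend on how many instances of each symbol occur and on how well they are aligned.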
