Hough technique for bar charts detection and recognition in document images

Charts are a common graphic representation of scientific data in technical and business papers. We present a robust system for detecting and recognizing bar charts. The system has three stages: preprocessing, detection, and recognition. The core algorithm of the detection stage is a newly developed modified probabilistic Hough transform for detecting clusters of parallel lines. The main algorithms of the recognition stage are bar pattern reconstruction and text primitive grouping in the Hough space, both of which are also original. Experiments show that the system can also recognize slanted bar charts and even hand-drawn charts.
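
To illustrate why the Hough space suits parallel-line cluster detection, here is a minimal sketch. It is not the modified probabilistic Hough transform described above; it uses a plain, exhaustive Hough vote and a simple per-angle grouping of peaks. The function names (`hough_votes`, `parallel_cluster`) and the parameters (`n_theta`, `rho_step`, `min_votes`) are hypothetical, chosen for this example only. The idea is that the edges of a set of bars share one orientation, so their peaks fall in the same angle bin of the accumulator and differ only in offset.

```python
import math
from collections import defaultdict


def hough_votes(points, n_theta=180, rho_step=1.0):
    """Standard Hough voting with the normal form rho = x*cos(theta) + y*sin(theta).

    Assumption: a plain exhaustive vote over all points, not the paper's
    modified probabilistic variant. Returns {(theta_bin, rho_bin): votes}.
    """
    acc = defaultdict(int)
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[(t, round(rho / rho_step))] += 1
    return acc


def parallel_cluster(acc, min_votes):
    """Group strong accumulator peaks by angle bin.

    Returns (theta_bin, sorted rho_bins) for the angle bin holding the most
    peaks with at least min_votes votes, or None if no cell qualifies.
    """
    by_theta = defaultdict(list)
    for (t, r), votes in acc.items():
        if votes >= min_votes:
            by_theta[t].append(r)
    if not by_theta:
        return None
    best_t = max(by_theta, key=lambda t: len(by_theta[t]))
    return best_t, sorted(by_theta[best_t])
```

For example, feeding in the pixels of three vertical segments at x = 10, 20, and 30 (each 50 pixels tall) and calling `parallel_cluster(acc, min_votes=40)` yields angle bin 0 with offsets `[10, 20, 30]`: one orientation, three parallel lines. A slanted chart would produce the same pattern in a different angle bin, which is consistent with the abstract's claim that slanted charts can be handled.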
