Chart analysis and recognition in document images

Hidden Markov models (HMMs) are a probabilistic modeling tool for time-series data. They have been applied successfully in many areas, including speech recognition and handwritten character recognition. In this paper, we present a novel statistical approach that uses ergodic hidden Markov models to recognize scientific charts, together with a newly developed feature extraction method for chart images. Unlike traditional primitive-based diagram recognition methods, our approach does not need to recognize the graphic primitives in a chart. It thereby avoids the recognition errors caused by inaccurate primitive extraction, which are a major obstacle to building a general chart recognition system.
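
To illustrate the general idea, the sketch below shows one common way to set up HMM-based recognition: one discrete ergodic HMM per chart class, with a feature sequence assigned to the class whose model gives it the highest likelihood under the forward algorithm. This is a minimal sketch under stated assumptions and not necessarily the procedure used in this paper. The class names, the two-symbol alphabet, and all probabilities are hypothetical, and the feature sequences are assumed to be already quantized into discrete symbols.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][k]  : probability that state i emits symbol k
    """
    n = len(start)
    # Initialise with the start distribution and the first symbol's emission.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    total = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then emit.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        # Rescale to avoid underflow; the log of the scales sums to the result.
        total += math.log(scale)
        alpha = [a / scale for a in alpha]
    return total

def classify(obs, models):
    """Return the name of the model with the highest likelihood for obs."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

# Hypothetical two-state ergodic models (every transition is non-zero)
# over a two-symbol alphabet. All numbers are made up for illustration.
MODELS = {
    "bar_chart": (
        [0.5, 0.5],
        [[0.7, 0.3], [0.4, 0.6]],
        [[0.9, 0.1], [0.6, 0.4]],
    ),
    "line_chart": (
        [0.5, 0.5],
        [[0.6, 0.4], [0.5, 0.5]],
        [[0.2, 0.8], [0.1, 0.9]],
    ),
}

if __name__ == "__main__":
    print(classify([0, 0, 0, 0], MODELS))
    print(classify([1, 1, 1, 1], MODELS))
```

In this toy setup the decision depends only on the symbol sequence. In a real system the symbols would come from the feature extraction step, and the model parameters would be estimated from labeled training charts, for example with Baum-Welch re-estimation.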