Document Image Decoding Using Markov Source Models

Document image decoding (DID) is a communication theory approach to document image recognition. In DID, a document recognition problem is viewed as consisting of three elements: an image generator, a noisy channel and an image decoder. A document image generator is a Markov source (stochastic finite-state automaton) that combines a message source with an imager. The message source produces a string of symbols, or text, that contains the information to be transmitted. The imager is modeled as a finite-state transducer that converts the 1D message string into an ideal 2D bitmap. The channel transforms the ideal image into a noisy observed image. The decoder estimates the message, given the observed image, by finding the a posteriori most probable path through the combined source and channel models using a Viterbi-like dynamic programming algorithm. The proposed approach is illustrated on the problem of decoding scanned telephone yellow pages to extract names and numbers from the listings. A finite-state model for yellow page columns was constructed and used to decode a database of scanned column images containing about 1100 individual listings.
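The source, channel and decoder roles described above can be illustrated with a deliberately reduced sketch. It is not the paper's implementation: the "image" is a 1D bit string rather than a 2D bitmap, the channel is assumed to flip each pixel independently with probability `FLIP`, and the states, templates and transition probabilities in `TRANSITIONS` are hypothetical values chosen only for illustration. The decoder is a plain Viterbi-style dynamic program over (pixel position, source state).

```python
import math

FLIP = 0.1  # assumed probability that the channel flips one pixel

# Hypothetical toy Markov source. Each transition is
# (from_state, to_state, message_symbol, template, probability).
# Templates are 1D bit strings standing in for 2D glyph bitmaps.
TRANSITIONS = [
    ("start", "text", "a", "1101", 0.5),
    ("start", "text", "b", "1001", 0.5),
    ("text", "text", "a", "1101", 0.3),
    ("text", "text", "b", "1001", 0.3),
    ("text", "text", " ", "00", 0.2),
    ("text", "end", "", "0", 0.2),
]


def channel_loglik(template, observed):
    """Log-likelihood of the observed pixels given the ideal template."""
    return sum(
        math.log(1 - FLIP if t == o else FLIP)
        for t, o in zip(template, observed)
    )


def decode(observed):
    """Return the message on the most probable path, or None if no path fits."""
    n = len(observed)
    # best[pos][state] = (log score, previous pos, previous state, symbol)
    best = [dict() for _ in range(n + 1)]
    best[0]["start"] = (0.0, None, None, "")
    for pos in range(n + 1):
        for state, (score, _, _, _) in list(best[pos].items()):
            for src, dst, symbol, template, prob in TRANSITIONS:
                end = pos + len(template)
                if src != state or end > n:
                    continue
                cand = (
                    score
                    + math.log(prob)
                    + channel_loglik(template, observed[pos:end])
                )
                if dst not in best[end] or cand > best[end][dst][0]:
                    best[end][dst] = (cand, pos, state, symbol)
    if "end" not in best[n]:
        return None
    # Follow back-pointers from the final state to recover the message.
    symbols = []
    pos, state = n, "end"
    while best[pos][state][1] is not None:
        _, prev_pos, prev_state, symbol = best[pos][state]
        symbols.append(symbol)
        pos, state = prev_pos, prev_state
    return "".join(reversed(symbols))
```

In this toy model, calling `decode("110110010011010")` walks the templates for `a`, `b`, a space, `a` and the end marker. The same call with one pixel flipped, `decode("111110010011010")`, still recovers the message because the single mismatch costs less than any alternative path.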
