Turbo recognition: a statistical approach to layout analysis

Turbo recognition (TR) is a communication-theoretic approach to the analysis of rectangular layouts, in the spirit of Document Image Decoding. The TR algorithm, inspired by turbo decoding, is based on a generative model of image production in which two grammars are used simultaneously to describe structure in the two orthogonal (horizontal and vertical) directions. This enables TR to strictly embody non-local constraints that cannot be taken into account by local statistical methods. Its basis in finite-state grammars also allows TR to be quickly retargeted to new domains. We illustrate some of the capabilities of TR with two examples involving realistic images. While TR, like turbo decoding, is not guaranteed to recover the statistically optimal solution, we present an experiment that demonstrates its ability to produce optimal or near-optimal results on a simple yet nontrivial example: the recovery of a filled rectangle in the midst of noise. Unlike methods such as stochastic context-free grammars and exhaustive search, which are often intractable beyond small images, turbo recognition scales linearly with image size, suggesting TR as an efficient yet near-optimal approach to statistical layout analysis.
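To make the filled-rectangle example concrete, the following is a minimal sketch of the exhaustive-search baseline for that toy problem. It is not the TR algorithm itself. It assumes a noise model the abstract does not specify (each pixel flipped independently with a fixed probability below 0.5), under which the most likely rectangle is the one with the fewest mismatching pixels. All function names are hypothetical and chosen for illustration only.

```python
import random


def make_rectangle(height, width, top, left, bottom, right):
    """Binary image with 1s on rows [top, bottom) and columns [left, right)."""
    return [[1 if top <= r < bottom and left <= c < right else 0
             for c in range(width)]
            for r in range(height)]


def add_flip_noise(image, flip_prob, rng):
    """Assumed channel: flip each pixel independently with probability flip_prob."""
    return [[pixel ^ 1 if rng.random() < flip_prob else pixel for pixel in row]
            for row in image]


def hamming(a, b):
    """Number of pixels at which two equally sized images differ."""
    return sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def integral_image(image):
    """Summed-area table: s[r][c] is the number of 1s in image[:r][:c]."""
    h, w = len(image), len(image[0])
    s = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        for c in range(w):
            s[r + 1][c + 1] = image[r][c] + s[r][c + 1] + s[r + 1][c] - s[r][c]
    return s


def best_rectangle(noisy):
    """Try every axis-aligned rectangle and keep the one with fewest mismatches."""
    h, w = len(noisy), len(noisy[0])
    s = integral_image(noisy)
    total_ones = s[h][w]
    best, best_cost = None, h * w + 1
    for top in range(h):
        for bottom in range(top + 1, h + 1):
            for left in range(w):
                for right in range(left + 1, w + 1):
                    area = (bottom - top) * (right - left)
                    ones_inside = (s[bottom][right] - s[top][right]
                                   - s[bottom][left] + s[top][left])
                    # Mismatches = 0s inside the rectangle + 1s outside it.
                    cost = (area - ones_inside) + (total_ones - ones_inside)
                    if cost < best_cost:
                        best, best_cost = (top, left, bottom, right), cost
    return best, best_cost


if __name__ == "__main__":
    clean = make_rectangle(8, 10, 2, 3, 6, 8)
    noisy = add_flip_noise(clean, 0.1, random.Random(0))
    print(best_rectangle(noisy), "flipped pixels:", hamming(noisy, clean))
```

On a noiseless input the search returns the original rectangle with zero mismatches. The four nested loops visit H(H+1)/2 × W(W+1)/2 candidate rectangles for an H × W image, so the work grows roughly with the square of the pixel count even with constant-time scoring, which is the kind of cost that a method scaling linearly with image size avoids.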
