Recent Work in the Document Image Decoding Group at Xerox PARC

Speed Enhancements to DID (Section 2). When Document Image Decoding (DID) was proposed [15], its attractiveness lay primarily in its potential for high recognition accuracy, owing to its communications-theoretic framework and its well-defined models and objective function (posterior probability). Its initial implementations, however, suffered from high computational cost relative to commercial OCR methods. We summarize recent progress in reducing that cost. Importantly, these speed enhancements do not come at the expense of accuracy: they are guaranteed to produce the same recognition output as DID without the enhancements.

[1] Rolf Johannesson, et al. Fundamentals of Convolutional Coding, 1999.

[2] George Nagy, et al. Style consistency in pattern fields, 2000.

[3] Robert M. Haralick, et al. Document structure analysis and performance evaluation, 1999.

[4] Justin K. Romberg, et al. Adding linguistic constraints to document image decoding: comparing the iterated complete path and stack algorithms, 2000, IS&T/SPIE Electronic Imaging.

[5] A. Lawrence Spitz. Style-Directed Document Recognition, 1999.

[6] Thomas M. Breuel, et al. Modeling the sample distribution for clustering OCR, 2000, IS&T/SPIE Electronic Imaging.

[7] Gary E. Kopec, et al. Document Image Decoding by Heuristic Search, 1996, IEEE Trans. Pattern Anal. Mach. Intell.

[8] R. Tibshirani, et al. Discriminant Analysis by Gaussian Mixtures, 1996.

[9] Thomas M. Breuel, et al. Classification by probabilistic clustering, 2001, IEEE International Conference on Acoustics, Speech, and Signal Processing.

[10] David S. Doermann, et al. The Indexing and Retrieval of Document Images: A Survey, 1998, Comput. Vis. Image Underst.

[11] Azriel Rosenfeld, et al. The Development of a General Framework for Intelligent Document Image Retrieval, 1996, DAS.

[12] Kris Popat, et al. Decoding of text lines in grayscale document images, 2001, IEEE International Conference on Acoustics, Speech, and Signal Processing.

[13] Henry S. Baird, et al. Language-free layout analysis, 1993, Proceedings of 2nd International Conference on Document Analysis and Recognition (ICDAR '93).

[14] Adrian E. Raftery, et al. How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis, 1998, Comput. J.

[15] Frederick Jelinek, et al. Statistical methods for speech recognition, 1997.

[16] Philip A. Chou, et al. Document Image Decoding Using Markov Source Models, 1994, IEEE Trans. Pattern Anal. Mach. Intell.

[17] Lawrence O'Gorman, et al. The Document Spectrum for Page Layout Analysis, 1993, IEEE Trans. Pattern Anal. Mach. Intell.