Paper to PDA

A system is described for automatically analyzing a document image into atomic fragments (e.g., word images) that can be reconstructed, or "reflowed", onto a display device of arbitrary size, depth, and aspect ratio. The main intent is to let scans and other page-image documents be viewed effectively on a limited-resolution hand-held computing device, without the errors and losses introduced by OCR and retypesetting. The methods of image analysis and representation are described.
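The reflow idea can be illustrated with a minimal sketch. It assumes the page has already been segmented into word images in reading order, and it uses greedy line filling with top alignment. That is a simplification chosen for illustration, not necessarily the layout method the paper uses. The names `WordImage` and `reflow` and the `gap` and `leading` parameters are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WordImage:
    """Hypothetical record for one cropped word image (sizes in pixels)."""
    width: int
    height: int


def reflow(words: List[WordImage], display_width: int,
           gap: int = 4, leading: int = 2) -> List[Tuple[int, int]]:
    """Return the top-left (x, y) position of each word, in reading order.

    Words are packed left to right; a word that does not fit on the
    current line starts a new one. Assumed simplification: words are
    top-aligned rather than aligned on a text baseline.
    """
    positions: List[Tuple[int, int]] = []
    x, y, line_height = 0, 0, 0
    for word in words:
        # Break the line only if it already holds a word, so an
        # over-wide word still gets placed on a line of its own.
        if x > 0 and x + word.width > display_width:
            x = 0
            y += line_height + leading
            line_height = 0
        positions.append((x, y))
        x += word.width + gap
        line_height = max(line_height, word.height)
    return positions


words = [WordImage(40, 10), WordImage(50, 12), WordImage(30, 10)]
wide_layout = reflow(words, display_width=100)
narrow_layout = reflow(words, display_width=60)
```

The same word images yield a different layout for each display width, which is the property that lets a scanned page adapt to a small screen without recognizing or retypesetting the text.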
