At the frontiers of OCR

It is argued that optical character recognition (OCR) research needs a major change of approach. The traditional approach, which focuses on the correct classification of isolated characters, has been exhausted. Demonstrating the superiority of a new classification method under operational conditions requires large experimental facilities and databases beyond the resources of most researchers. In any case, even perfect classification of individual characters is insufficient for converting complex archival documents to a useful computer-readable form. Many practical OCR tasks require integrated treatment of entire documents, together with well-organized typographic and domain-specific knowledge. New OCR systems should take advantage of the typographic uniformity of paragraphs and other layout components. They should also exploit the unavoidable interaction with human operators to improve themselves without explicit 'training'.
