Edit Distance of Regular Languages

The edit distance between two strings, and between a string and a language, are well-known concepts with applications in optical character recognition (OCR) and document image analysis (DIA). In the present paper, a generalization is proposed, namely the edit distance of two regular languages. A method for computing the edit distance of two regular languages is introduced and its correctness is shown. Applications in OCR and DIA are also discussed.
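To make the notion concrete, the following is a minimal sketch, not the method of the paper. It assumes that the edit distance of two languages is the minimum string edit distance over all pairs (x, y) with x in the first language and y in the second, that insertions, deletions and substitutions each cost 1, and that both languages are given as NFAs without epsilon transitions. The representation `(starts, accepts, edges)` and the name `language_edit_distance` are hypothetical. Under these assumptions the distance is the cost of a cheapest path from a pair of start states to a pair of accepting states in the product of the two automata, which Dijkstra's algorithm can find.

```python
import heapq
from collections import defaultdict


def language_edit_distance(nfa1, nfa2):
    """Minimum unit-cost edit distance between any word of nfa1 and any word of nfa2.

    Each NFA is a hypothetical triple (starts, accepts, edges), where edges is
    a list of (source, symbol, target) and states are integers.
    Returns None if either language is empty.
    """
    starts1, accepts1, edges1 = nfa1
    starts2, accepts2, edges2 = nfa2

    # Index outgoing transitions by source state.
    out1, out2 = defaultdict(list), defaultdict(list)
    for src, sym, dst in edges1:
        out1[src].append((sym, dst))
    for src, sym, dst in edges2:
        out2[src].append((sym, dst))

    # Dijkstra over pairs of states (p, q) of the product automaton.
    heap = [(0, p, q) for p in starts1 for q in starts2]
    heapq.heapify(heap)
    settled = set()
    while heap:
        cost, p, q = heapq.heappop(heap)
        if (p, q) in settled:
            continue
        settled.add((p, q))
        if p in accepts1 and q in accepts2:
            return cost  # first accepting pair popped has minimal cost
        for a, p_next in out1[p]:
            # Delete symbol a: advance only in the first automaton.
            heapq.heappush(heap, (cost + 1, p_next, q))
            for b, q_next in out2[q]:
                # Match (cost 0) or substitute (cost 1): advance in both.
                heapq.heappush(heap, (cost + (a != b), p_next, q_next))
        for _, q_next in out2[q]:
            # Insert a symbol: advance only in the second automaton.
            heapq.heappush(heap, (cost + 1, p, q_next))
    return None


# L1 = {"ab"}; L2 = a*c (zero or more a's followed by c).
only_ab = ({0}, {2}, [(0, "a", 1), (1, "b", 2)])
a_star_c = ({0}, {1}, [(0, "a", 0), (0, "c", 1)])
distance = language_edit_distance(only_ab, a_star_c)
```

In this example the closest pair is "ab" and "ac", which differ by one substitution. The product automaton has at most |Q1| · |Q2| states, so the search is polynomial in the sizes of the two automata; other cost functions would only change the edge weights.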
