Levenshtein Distance Technique in Dictionary Lookup Methods: An Improved Approach

Dictionary lookup methods are a popular way to resolve ambiguous letters that an optical character reader failed to recognize. A robust dictionary lookup method can be complex, however, because a priori probability calculations and large dictionaries both increase the overhead and cost of searching. In this context, Levenshtein distance is a simple metric that can serve as an effective tool for approximate string matching. Having observed the effectiveness of this metric, this work improves on it by grouping similar-looking letters and reducing the weighted difference among members of the same group. The results showed marked improvement over the traditional Levenshtein distance technique.
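The grouping idea can be illustrated with a minimal sketch. It is not the authors' implementation: the letter groups in `SIMILAR_GROUPS`, the reduced cost of 0.5, and the function names are all assumptions chosen for illustration, since the abstract does not list the actual groups or weights.

```python
# Hypothetical groups of similar-looking letters (assumed, not from the paper).
SIMILAR_GROUPS = [set("co"), set("il"), set("mn"), set("uv")]


def substitution_cost(a, b, in_group_cost=0.5):
    """Cost of replacing a with b; cheaper when both share a group."""
    if a == b:
        return 0.0
    for group in SIMILAR_GROUPS:
        if a in group and b in group:
            return in_group_cost  # assumed reduced weight
    return 1.0


def weighted_levenshtein(s, t, in_group_cost=0.5):
    """Levenshtein distance with reduced cost for in-group substitutions.

    With in_group_cost=1.0 this is the traditional Levenshtein distance.
    """
    prev = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        curr = [float(i)]
        for j, b in enumerate(t, 1):
            curr.append(min(
                prev[j] + 1.0,       # delete a
                curr[j - 1] + 1.0,   # insert b
                prev[j - 1] + substitution_cost(a, b, in_group_cost),
            ))
        prev = curr
    return prev[-1]


def best_match(word, dictionary, in_group_cost=0.5):
    """Return the dictionary entry closest to word (first entry wins ties)."""
    return min(dictionary,
               key=lambda w: weighted_levenshtein(word, w, in_group_cost))


if __name__ == "__main__":
    candidates = ["beat", "boat", "brat"]
    # Traditional distance: all three candidates are one substitution away.
    print(best_match("bcat", candidates, in_group_cost=1.0))
    # Grouped distance: "c" and "o" share a group, so "boat" is closer.
    print(best_match("bcat", candidates))
```

Under these assumed groups, the misread string "bcat" is equally far from all three candidates with the traditional metric, while the grouped variant prefers "boat" because substituting "c" for "o" costs less than an arbitrary substitution.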
