A Consensus Algorithm for Approximate String Matching

Abstract. Approximate string matching (ASM) is a well-known computational problem with important applications in database searching, plagiarism detection, spelling correction, and bioinformatics. The two main issues with most ASM algorithms are (1) computational complexity and (2) low specificity due to a large number of reported false positives. In this paper, an efficient ASM method is proposed, along with a post-processing stage designed to significantly reduce the number of false positives. Results on random strings show that the proposed method can perform a search within a large (1 Mb) string in about 100 ms, with sensitivity and specificity of nearly 100%.