Paper information - Fast string matching using an n‐gram algorithm

Fast string matching using an n‐gram algorithm

Experimental results are given for the application of a new n‐gram algorithm to substring searching in DNA strings. The results confirm theoretical predictions of expected running times based on the assumption that the data are drawn from a stationary ergodic source. They also confirm that the algorithms tested are the most efficient known for searches involving larger patterns.

John Shawe-Taylor | Jong Yong Kim
