An Efficient Tool for Discovering Simple Combinatorial Patterns from Large Text Databases

In this poster, we present a demonstration of a prototype system for the efficient discovery of combinatorial patterns, called proximity word-association patterns, from a collection of texts. The algorithm computes the best k-proximity d-word patterns in almost linear expected time in the total input length n, which is drastically faster than a straightforward algorithm with O(n^{2d+1}) time complexity.
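
To make the pattern class concrete, the following minimal sketch tests whether a single k-proximity d-word pattern matches one text. It illustrates the pattern semantics only and is not the efficient discovery algorithm described above. The matching rule is an assumption made for illustration: the d words must occur in the given order, with at most k characters between the end of one word and the start of the next. The function names occurrences and matches are hypothetical.

```python
def occurrences(text, word):
    """Return every start index at which word occurs in text (overlaps allowed)."""
    starts = []
    i = text.find(word)
    while i != -1:
        starts.append(i)
        i = text.find(word, i + 1)
    return starts


def matches(text, words, k):
    """Check whether words occur in text in order, with gaps of at most k characters.

    Assumed semantics (not taken from the source): the gap is measured from
    the end of one word's occurrence to the start of the next word's occurrence.
    """
    if not words:
        return True
    # Exclusive end positions of every occurrence of the first word.
    ends = {s + len(words[0]) for s in occurrences(text, words[0])}
    for word in words[1:]:
        next_ends = set()
        for s in occurrences(text, word):
            # Keep this occurrence if some earlier word ends 0..k characters before it.
            if any(0 <= s - e <= k for e in ends):
                next_ends.add(s + len(word))
        ends = next_ends
        if not ends:
            return False
    return bool(ends)
```

For example, under these assumptions matches("the quick brown fox", ["quick", "fox"], 7) holds, while the same call with k = 6 does not, because seven characters separate the end of "quick" from the start of "fox". Tracking every reachable end position, rather than only the leftmost occurrence, avoids missing a match when an earlier occurrence of a word lies too far from the next one.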