Discovering Most Classificatory Patterns for Very Expressive Pattern Classes

The classificatory power of a pattern is measured by how well it separates two given sets of strings. This paper gives practical algorithms for finding the fixed/variable-length-don't-care pattern (FVLDC pattern) and the approximate FVLDC pattern that are most classificatory for two given string sets. We also present algorithms for discovering the best window-accumulated FVLDC pattern and the best window-accumulated approximate FVLDC pattern. All of our new algorithms run in a practical amount of time by means of suitable pruning heuristics and fast pattern matching techniques.
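
To make the notion of "separating two string sets" concrete, the following is a minimal illustrative sketch, not the algorithms of this paper. It assumes that a fixed-length don't-care (written `?` here) matches exactly one character, that a variable-length don't-care (written `*` here) matches any string including the empty one, and that a pattern matches a string when it occurs anywhere in it. The score used, the absolute difference between the fractions of matched strings in the two sets, is a simple stand-in for whatever scoring function is actually employed. The names `fvldc_to_regex` and `separation_score` are hypothetical.

```python
import re


def fvldc_to_regex(pattern, fixed="?", variable="*"):
    """Translate a don't-care pattern into a compiled regular expression.

    Assumed notation (not from the paper): `fixed` stands for exactly one
    arbitrary character, `variable` for any (possibly empty) string.
    """
    parts = []
    for ch in pattern:
        if ch == fixed:
            parts.append(".")
        elif ch == variable:
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def separation_score(pattern, positives, negatives):
    """Stand-in score: |fraction of positives matched - fraction of negatives matched|.

    Both string sets are assumed to be non-empty.
    """
    rx = fvldc_to_regex(pattern)
    pos_hits = sum(1 for s in positives if rx.search(s))
    neg_hits = sum(1 for s in negatives if rx.search(s))
    return abs(pos_hits / len(positives) - neg_hits / len(negatives))


if __name__ == "__main__":
    positives = ["abcab", "xabyab"]
    negatives = ["ba", "bbb"]
    for p in ["ab*ab", "b?a", "b"]:
        print(p, separation_score(p, positives, negatives))
```

Under this stand-in score, a pattern that matches every string in one set and none in the other scores 1, while a pattern that matches both sets equally often scores 0. Finding the best pattern by scoring every candidate this way would be exhaustive search, which is the cost that pruning heuristics and fast matching are meant to avoid.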