Minimal Discriminating Words Problem Revisited

We revisit two variants of the problem of computing minimal discriminating words studied in [5]. Given a pattern P and a threshold d, we want to report (i) all shortest extensions of P that occur in fewer than d documents, and (ii) all shortest extensions of P that occur only in d selected documents. For the first problem, we give an optimal solution with constant time per output word. For the second problem, we propose an algorithm with running time O(|P| + d·(1 + output)), improving on the solution of [5].
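To make variant (i) concrete, the following is a brute-force sketch in Python. It is not the algorithm of this paper or of [5], and it rests on assumptions that the abstract does not state: an "extension" of P is taken to be a word having P as a prefix, an extension must occur in at least one document, and "shortest" is read as prefix-minimal (the word occurs in fewer than d documents, while each of its shorter prefixes that still starts with P occurs in at least d). The names `doc_count` and `minimal_discriminating_extensions` are hypothetical.

```python
def doc_count(word, docs):
    """Number of documents that contain `word` as a substring."""
    return sum(1 for doc in docs if word in doc)


def minimal_discriminating_extensions(pattern, docs, d):
    """Brute-force illustration of variant (i), under the assumptions above.

    Returns the prefix-minimal words that start with `pattern`, occur in at
    least one document, and occur in fewer than `d` documents.
    """
    results = []
    frontier = [pattern]  # candidate words, processed level by level
    while frontier:
        next_frontier = []
        for word in frontier:
            count = doc_count(word, docs)
            if count == 0:
                # Only possible for the initial pattern: nothing to extend.
                continue
            if count < d:
                # Rare enough: report it and do not extend it further.
                results.append(word)
                continue
            # Still too frequent: try every character that follows an
            # occurrence of `word` in some document.
            next_chars = set()
            for doc in docs:
                start = doc.find(word)
                while start != -1:
                    end = start + len(word)
                    if end < len(doc):
                        next_chars.add(doc[end])
                    start = doc.find(word, start + 1)
            next_frontier.extend(word + c for c in sorted(next_chars))
        frontier = next_frontier
    return results
```

For example, with documents "abcx", "abcy" and "abd", pattern "ab" and d = 2, this sketch reports "abd", "abcx" and "abcy": the word "abc" still occurs in two documents, so it is extended by one more character. The sketch rescans every document for each candidate and is meant only to pin down the problem statement; it comes nowhere near the constant time per output word claimed above.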