Improving precision and recall for Soundex retrieval

We present a phonetic algorithm for name searches that fuses existing techniques [the Soundex system of Russell and the techniques of J. Celko (1995) and U. Pfeifer et al.] with new features. This combination offers improved precision and recall. In the described experiments, each name is assigned multiple phonetic codes. The experiments count the phonetic codes and digrams that two names have in common and use the Dice coefficient to turn those counts into a similarity score for the pair. We use the Pfeifer corpus and relevance assessments to compare our experimental results with those of traditional techniques.
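To make the scoring step concrete, the following is a minimal Python sketch of a Dice coefficient computed over digrams and phonetic codes. It is an illustration under stated assumptions, not the implementation used in the experiments: the function names (digrams, dice, name_similarity) are hypothetical, features are treated as sets rather than counted with multiplicity, the phonetic codes are supplied by the caller instead of being generated here, and pooling digrams and codes into one feature set is only one plausible way to combine them.

```python
def digrams(name):
    """Return the set of adjacent letter pairs in a name, ignoring case
    and any non-letter characters."""
    letters = [c for c in name.upper() if c.isalpha()]
    return {a + b for a, b in zip(letters, letters[1:])}


def dice(a, b):
    """Dice coefficient of two sets: 2 * |A & B| / (|A| + |B|).

    Returns 0.0 when both sets are empty (an assumption made here to
    avoid dividing by zero).
    """
    total = len(a) + len(b)
    if total == 0:
        return 0.0
    return 2 * len(a & b) / total


def name_similarity(name_a, codes_a, name_b, codes_b):
    """Score two names by pooling their digrams and phonetic codes.

    codes_a and codes_b are sets of phonetic codes chosen by the caller.
    Tagging each feature with its kind keeps a digram from colliding
    with a code that happens to have the same spelling. Pooling both
    kinds into one set is an assumption of this sketch.
    """
    feats_a = {("digram", d) for d in digrams(name_a)} | {("code", c) for c in codes_a}
    feats_b = {("digram", d) for d in digrams(name_b)} | {("code", c) for c in codes_b}
    return dice(feats_a, feats_b)


# "night" and "nacht" share one digram (HT) out of four each.
print(dice(digrams("night"), digrams("nacht")))  # 0.25
```

With one shared code added to each name, for example name_similarity("night", {"N230"}, "nacht", {"N230"}), the pooled sets have five features each and two in common, so the score rises to 0.4. Assigning several codes per name gives two spellings more opportunities to share a feature, which is how a scheme of this kind can raise recall.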