What's related? Generalizing approaches to related articles in medicine

INTRODUCTION We conducted formative evaluations of several variations in how related articles are computed for non-bibliographic resources in the medical domain. METHODS A binary model and several variations of the vector space model were used to measure similarity between documents. Two corpora were studied, with a human expert's judgments serving as the gold standard. RESULTS Variations in term weights and stopword choices made little difference to performance. Performance was worse when documents were characterized by title words alone or by MeSH terms extracted from document references. DISCUSSION Further studies are needed to evaluate these methods in medical information retrieval systems.
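
For illustration only, the two families of similarity measures named in the methods can be sketched as follows. This is a minimal sketch, not the study's implementation: the abstract does not specify the exact binary measure, term weighting scheme, or stopword list, so the Jaccard coefficient, the smoothed TF-IDF weights, the `STOPWORDS` set, and the sample `docs` are all assumptions chosen for the example.

```python
import math
from collections import Counter

# Hypothetical stopword list; the study's actual stopword choices are not given.
STOPWORDS = {"the", "of", "in", "and", "a", "for", "with"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def binary_similarity(doc_a, doc_b):
    """Binary model (assumed here to be Jaccard): terms are present or absent."""
    a, b = set(tokenize(doc_a)), set(tokenize(doc_b))
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def tfidf_vectors(docs):
    """Vector space model: one sparse TF-IDF vector (a dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF (an assumption), so every term keeps a positive weight.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up documents, used only to exercise the functions above.
docs = [
    "treatment of asthma in children",
    "asthma treatment guidelines for adults",
    "surgical repair of hip fracture",
]
vectors = tfidf_vectors(docs)

print("binary :", binary_similarity(docs[0], docs[1]), binary_similarity(docs[0], docs[2]))
print("cosine :", cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```

In this sketch, swapping the weighting inside `tfidf_vectors` or editing `STOPWORDS` corresponds to the kinds of variations the abstract reports as making little difference, while restricting the input text to titles corresponds to the title-only condition.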