A similarity measurement of clinical trials using SNOMED — A preliminary study

There is an increasing need to accurately and efficiently find relevant clinical trials for patients, practitioners, and researchers. This paper proposes a method for measuring the similarity among clinical trials and explores its potential use in efficiently suggesting relevant clinical trials. SNOMED terms are used to extract and normalize the clinical trial titles (CTTs). Similarity matrices are then calculated automatically from the similarity measures. A total of 1,360 CTTs were extracted from ClinicalTrials.gov, covering the five diseases that are the leading causes of death in the United States: heart disease, cancer, stroke, diabetes, and lung disease. One similarity matrix is generated for each of the five diseases. Results show that 1.2% of the clinical trial pairs are closely similar, and that clinical trials for diabetes have the highest average similarity ratio. Future research will combine multiple methods, such as ontological and statistical approaches, to improve the precision and recall of the search results.
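To make the idea of a similarity matrix over normalized titles concrete, the following is a minimal sketch and not the study's actual method. It assumes each title has already been reduced to a set of normalized terms, and it uses Jaccard overlap as a stand-in similarity measure, since the abstract does not name the measure used. The function names, the example term sets, and the 0.5 threshold for a "close" pair are all hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two term sets (an assumed stand-in measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_matrix(term_sets: list) -> list:
    """Build a symmetric pairwise similarity matrix for a list of term sets."""
    n = len(term_sets)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 1.0  # each title is treated as identical to itself
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = jaccard(term_sets[i], term_sets[j])
    return matrix


def close_pair_ratio(matrix: list, threshold: float = 0.5) -> float:
    """Fraction of distinct pairs whose similarity meets the (hypothetical) threshold."""
    n = len(matrix)
    total_pairs = n * (n - 1) // 2
    if total_pairs == 0:
        return 0.0
    close = sum(1 for i, j in combinations(range(n), 2) if matrix[i][j] >= threshold)
    return close / total_pairs


# Hypothetical normalized term sets, one per clinical trial title.
example_titles = [
    {"diabetes mellitus type 2", "metformin"},
    {"diabetes mellitus type 2", "metformin", "obesity"},
    {"myocardial infarction", "aspirin"},
]

example_matrix = similarity_matrix(example_titles)
example_ratio = close_pair_ratio(example_matrix)
```

In this toy input, only the first two titles share terms, so one of the three distinct pairs counts as close. A per-disease analysis would repeat the same computation on each disease's titles separately.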