A corpus comparison approach for terminology extraction

This article examines one possible approach to identifying technical terms: a corpus comparison approach based on the range and frequency of word forms. The approach identifies terms by means of a ratio that compares the range and frequency of each word form in a technical corpus with its range and frequency in a comparison corpus. A rating scale approach is used as the basis for evaluating the corpus comparison approach. The analysis shows that the corpus comparison approach works reasonably well, with around 86% overlap with the results from the rating scale approach. It also shows that the corpus comparison approach using word types is a reasonably simple and practical way of identifying terms.
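
To make the idea of a range and frequency ratio concrete, the following is a minimal sketch in Python, not the procedure used in the study. It assumes that each corpus is given as a list of texts, that range means the number of texts in which a word type occurs, and that frequency is normalised by corpus size. The function names, the thresholds `min_ratio` and `min_range`, and the `smoothing` constant are hypothetical choices made for illustration only; the abstract does not specify them.

```python
import re
from collections import Counter


def word_types(text):
    """Split a text into lowercase alphabetic word forms."""
    return re.findall(r"[a-z]+", text.lower())


def profile(texts):
    """Return (frequency, range, total tokens) for a corpus given as a list of texts."""
    freq, rng = Counter(), Counter()
    for text in texts:
        tokens = word_types(text)
        freq.update(tokens)       # frequency: total occurrences of each word type
        rng.update(set(tokens))   # range: number of texts containing the word type
    return freq, rng, sum(freq.values())


def candidate_terms(technical, comparison, min_ratio=2.0, min_range=2, smoothing=0.5):
    """Return word types whose relative frequency in the technical corpus is at
    least min_ratio times their relative frequency in the comparison corpus.

    All three keyword defaults are illustrative assumptions, not values from
    the article. Both corpora are assumed to be non-empty.
    """
    t_freq, t_range, t_total = profile(technical)
    c_freq, _, c_total = profile(comparison)
    terms = {}
    for word, count in t_freq.items():
        if t_range[word] < min_range:
            continue  # skip word types that occur in too few technical texts
        t_rate = count / t_total
        # Smoothing avoids division by zero for words absent from the comparison corpus.
        c_rate = (c_freq[word] + smoothing) / c_total
        ratio = t_rate / c_rate
        if ratio >= min_ratio:
            terms[word] = ratio
    return terms


technical = [
    "The neuron fires when the synapse releases transmitter",
    "A neuron receives input and the synapse adapts",
]
comparison = [
    "The cat sat on the mat and the dog barked",
    "A bird sings when the sun rises and sets",
]
print(sorted(candidate_terms(technical, comparison)))  # ['neuron', 'synapse']
```

In this toy example the word type "the" meets the range threshold but has a ratio below 1, so it is rejected, while "neuron" and "synapse" occur in both technical texts and are absent from the comparison texts, so they are kept. A real application would need much larger corpora and thresholds chosen empirically.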