A Study of Extraction Methods for Incident Subject Terms Based on Left-Right Branch and Between-Class Distribution Entropies

In this paper, we first select more informative phrases as candidate subject terms, using left-right branch entropy as the basis for recognizing subject-term boundaries. We then filter out part of the noise words by tracing the candidate subject terms back to the original document collection. Finally, we determine the subject terms that are more characteristic of each incident class through an entropy transformation that reveals the between-class weights of the subject terms. Experimental results on four incident classes demonstrate the effectiveness and practical value of the two methods.
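
To make the two entropy measures concrete, the following is a minimal sketch under stated assumptions, not the implementation used in this study. It assumes documents are already segmented into token lists, computes branch entropy as the Shannon entropy of the tokens adjacent to a candidate phrase, and uses one possible entropy transformation (one minus the normalized entropy of a term's frequency distribution across classes) as an illustrative between-class weight. The function names and the choice of transformation are hypothetical.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (natural log) of a collection of non-negative counts."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def branch_entropies(tokens, phrase):
    """Return (left, right) branch entropy of `phrase` within a token list.

    High values on both sides suggest the phrase appears in varied contexts,
    which is the usual signal that its boundaries are well formed.
    """
    n = len(phrase)
    left, right = Counter(), Counter()
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == phrase:
            if i > 0:
                left[tokens[i - 1]] += 1
            if i + n < len(tokens):
                right[tokens[i + n]] += 1
    return entropy(left.values()), entropy(right.values())


def between_class_weight(class_counts):
    """Illustrative weight in [0, 1] (an assumption, not the paper's formula).

    `class_counts` holds the frequency of one term in each incident class.
    The weight is 1 when the term occurs in a single class and approaches 0
    when it is spread evenly over all classes.
    """
    k = len(class_counts)
    if k < 2 or sum(class_counts) == 0:
        return 0.0
    return 1.0 - entropy(class_counts) / math.log(k)
```

Under this sketch, a candidate would be kept when both branch entropies exceed some threshold, and the surviving terms would then be ranked per class by the between-class weight; the thresholds themselves are not specified here.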