Barrier word method for detecting molecular biology multiple word terms

Abstract

Multiple-word biomedical terms may point to concepts in bibliographic citations with greater precision than individual words. The barrier word method detects multiple-word terms at first encounter in narrative text. Words with low biomedical information content (prepositions, articles, etc.) are designated as barrier words; each word sequence occurring between consecutive barrier words is a candidate multiple-word term. In 1,407 consecutive titles and abstracts listed under DNA, RECOMBINANT (D13.444.308.460), there were 1,275 barrier words, and 13,548 multiple-word terms were selected. These results indicate that the barrier word method is effective for detecting multiple-word terms in molecular biology narrative text.
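The segmentation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The barrier word list below is a small hypothetical sample (the actual list is not given here), the function name `candidate_terms` is invented for illustration, and two further assumptions are made: punctuation is treated as a boundary in the same way as a barrier word, and only sequences of two or more words are kept.

```python
import re

# Hypothetical sample of barrier words (low-information words such as
# prepositions and articles). The source's actual list is not reproduced here.
BARRIER_WORDS = {
    "a", "an", "the", "of", "in", "on", "for", "with", "and", "or",
    "to", "by", "is", "are", "was", "were", "that", "this", "at",
    "from", "as",
}


def candidate_terms(text, barrier_words=BARRIER_WORDS):
    """Return word sequences of two or more words found between barrier words."""
    terms = []
    # Assumption: punctuation ends a candidate sequence, so split on it first.
    for segment in re.split(r"[.,;:()!?]", text):
        current = []
        for word in segment.split():
            if word.lower() in barrier_words:
                # A barrier word closes the sequence collected so far.
                if len(current) >= 2:
                    terms.append(" ".join(current))
                current = []
            else:
                current.append(word)
        # The end of a segment also closes the current sequence.
        if len(current) >= 2:
            terms.append(" ".join(current))
    return terms


title = "Expression of the human growth hormone gene in Escherichia coli."
print(candidate_terms(title))
```

For the sample title, the sketch yields "human growth hormone gene" and "Escherichia coli"; the single word "Expression" is dropped because it is not a multiple-word sequence. Because the method needs only a list of barrier words rather than a dictionary of known terms, it can pick up a term the first time it appears in text.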