Measures of novelty in biomedical literature

We introduce several measures of novelty for a scientific article in MEDLINE based on the concepts associated with it. These concepts are identified using the Medical Subject Headings (MeSH) assigned to the article. We compute a temporal profile for each MeSH term (and for each pair of MeSH terms) based on its occurrences across MEDLINE, and then label each paper by its most novel MeSH term [see Figure 1] and its most novel pair of MeSH terms, with novelty measured both in years and in volume of prior work. Our approach is similar to earlier attempts to measure the novelty of an article, e.g. by using the frequency of co-citations [2] or the co-occurrence of keywords [1]; however, it differs in its use of pairwise concepts and of the controlled vocabulary of MeSH terms. We use pairs of concepts to quantify the novelty of an article because, although in principle all scientific publications present some novel concepts, it is rare for an article to coin a new concept that is widely adopted by the community. In contrast, the pairing of existing concepts is quite common in science, a hypothesis that our analysis confirms. Across all papers in MEDLINE published since 1985, we find that individual concept novelty is rare (5.4% of papers have a MeSH term <= 3 years old; 1.2% have a MeSH term <= 20 papers old), while combinatorial novelty is the norm (55% have a pair of MeSH terms <= 3 years old; 78% have a pair of MeSH terms <= 20 papers old) [see Figure 2].
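The labeling step described above can be illustrated with a minimal sketch in Python. It is not the authors' implementation. It assumes each paper is a hypothetical (paper_id, year, mesh_terms) tuple and that papers are processed in order of year, with ties broken by paper_id. It also assumes that the age of a concept in years is the publication year minus the year the concept first appeared, and that its age in papers is the number of earlier papers carrying it. The function name novelty_scores and the toy records are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations


def novelty_scores(papers):
    """Label each paper by its most novel MeSH term and MeSH pair.

    papers: iterable of (paper_id, year, mesh_terms) tuples (assumed layout).
    Returns {paper_id: {"single": {...} or None, "pair": {...} or None}}.
    """
    first_year = {}                 # concept -> year of first occurrence
    prior_count = defaultdict(int)  # concept -> number of papers seen so far
    results = {}

    # Assumption: chronological order by year, ties broken by paper_id.
    for paper_id, year, terms in sorted(papers, key=lambda p: (p[1], p[0])):
        terms = sorted(set(terms))
        # Single terms are stored as 1-tuples, pairs as sorted 2-tuples.
        concepts = {
            "single": [(t,) for t in terms],
            "pair": list(combinations(terms, 2)),
        }

        scores = {}
        for kind, keys in concepts.items():
            if not keys:
                scores[kind] = None
                continue
            # A concept never seen before has age 0 and 0 prior papers.
            ages = [year - first_year.get(k, year) for k in keys]
            counts = [prior_count[k] for k in keys]
            scores[kind] = {
                "min_age_years": min(ages),
                "min_prior_papers": min(counts),
            }
        results[paper_id] = scores

        # Update the temporal profiles only after scoring this paper.
        for keys in concepts.values():
            for k in keys:
                first_year.setdefault(k, year)
                prior_count[k] += 1

    return results


# Toy records, invented for illustration only.
papers = [
    ("p1", 1990, ["Neoplasms", "Mice"]),
    ("p2", 1992, ["Neoplasms", "Aspirin"]),
    ("p3", 1996, ["Neoplasms", "Mice"]),
    ("p4", 1997, ["Mice", "Aspirin"]),
]
scores = novelty_scores(papers)
```

In this toy data, paper p4 carries no new individual term (its youngest term is 5 years and 1 paper old), yet the pair it combines has never appeared before (0 years, 0 papers old). Thresholds such as "<= 3 years old" or "<= 20 papers old" could then be applied to min_age_years and min_prior_papers.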