A Comparison of Term Value Measurements for Automatic Indexing

A number of automatic indexing theories have been proposed over the last few years that assign significance values to linguistic entities in accordance with their importance for content representation. Among these are methodologies based on decision theory, information theory, communication theory, vector space transformation, and others. An attempt is made to compare these theories by exhibiting the formal frequency characteristics that underlie them. The effectiveness of the various approaches is also evaluated experimentally, using collections of documents in the areas of aerodynamics, medicine, and world affairs.