Automatic indexing of natural language texts: Cornell University

In information retrieval, stored documents and records are normally identified by sets of terms or keywords that collectively represent the document content. The task of assigning these terms to individual documents is known as indexing. Automatic indexing procedures developed in recent years outperform the conventional methods, which are based either on manual term assignment or on full-text indexing, where the words occurring in the document texts are used as index terms.
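To make the full-text baseline concrete, the following is a minimal sketch in which every word occurring in a document's text becomes an index term for that document. It illustrates only that conventional baseline, not the automatic indexing procedures discussed here. The function names (`tokenize`, `build_full_text_index`, `search`), the lowercase alphabetic tokenization, and the two sample documents are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens (an assumed, simplistic rule)."""
    return re.findall(r"[a-z]+", text.lower())


def build_full_text_index(documents):
    """Map each word to the set of ids of the documents whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Copy the first posting set so that intersecting does not modify the index.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample collection, used only to exercise the sketch.
documents = {
    "d1": "Automatic indexing assigns terms to documents.",
    "d2": "Manual indexing relies on trained human indexers.",
}
index = build_full_text_index(documents)
```

With this sample collection, `search(index, "manual indexing")` matches only `d2`, while `search(index, "indexing")` matches both documents. Every word is treated as an equally good index term here, which is the kind of simplification that more selective term assignment is meant to improve on.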