Performance Analysis of Semantic Indexing in Text Retrieval

We developed a new indexing formalism that represents the semantic content of a document using not only the terms it contains but also the concepts those terms express. In this approach, concept clusters are defined, and a concept vector space model is proposed to represent the semantic importance of words and concepts within a document. Experiments on the TREC-2 collection show that the proposed method outperforms an indexing method based on term frequency.
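
To make the contrast between term-based and concept-based indexing concrete, the following is a minimal sketch and not the method evaluated here. It assumes that a concept cluster is simply a named set of terms and that a concept's weight is the summed frequency of its member terms. The cluster contents, the names `CONCEPT_CLUSTERS`, `term_vector`, and `concept_vector`, and the weighting rule are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical concept clusters: each concept name maps to its member terms.
# (Assumption for illustration; the clusters are not specified in this text.)
CONCEPT_CLUSTERS = {
    "vehicle": {"car", "truck", "engine"},
    "finance": {"bank", "loan", "interest"},
}


def term_vector(tokens):
    """Baseline index: raw term frequency for every token."""
    return Counter(tokens)


def concept_vector(tokens, clusters=CONCEPT_CLUSTERS):
    """Concept index: sum the frequencies of each cluster's member terms.

    This additive weighting is an assumed stand-in, not the weighting
    scheme of the proposed model.
    """
    tf = Counter(tokens)
    return {
        concept: sum(tf[term] for term in members)
        for concept, members in clusters.items()
    }


doc = "the bank approved a car loan with low interest".split()
print(term_vector(doc))
print(concept_vector(doc))
```

In this toy document no single term occurs more than once, so term frequency alone cannot separate the topic words from the rest, while the concept vector gives "finance" a larger weight than "vehicle" because three of its member terms appear.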