Experimental evaluation of passage-based document retrieval

Retrieval of electronic documents is a fundamental component of intelligent access to document contents. For long documents, a method called passage-based document retrieval has proven effective. In this paper we show experimentally that passage-based retrieval is also advantageous for short queries, provided that the documents are long. We employ a passage-based method based on density distributions of query terms in documents and compare it with three conventional methods: the vector space model, pseudo-feedback, and latent semantic indexing.
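
To make the idea of a density distribution of query terms concrete, the following is a minimal sketch and not necessarily the formulation evaluated in this paper. It assumes that each occurrence of a query term spreads its weight over neighbouring word positions through a Hann-shaped window, and that a document is scored by the peak of the resulting density. The window shape, the window width, the peak-based score, and all function names are assumptions introduced for illustration.

```python
import math


def hann_window(width):
    """Symmetric Hann-style weights of odd length `width`, equal to 1.0 at the center.

    Assumption: the smoothing window is Hann-shaped; the abstract does not say so.
    """
    half = width // 2
    return [0.5 * (1.0 + math.cos(math.pi * k / (half + 1)))
            for k in range(-half, half + 1)]


def density_distribution(doc_terms, query_weights, width=5):
    """Return one density value per word position in the document.

    doc_terms:     list of tokens in document order
    query_weights: dict mapping query term -> weight (hypothetical weighting)
    width:         odd window width in word positions (assumed value)
    """
    half = width // 2
    window = hann_window(width)
    density = [0.0] * len(doc_terms)
    for pos, term in enumerate(doc_terms):
        weight = query_weights.get(term, 0.0)
        if weight == 0.0:
            continue  # not a query term, contributes nothing
        # Spread this occurrence's weight over the surrounding positions.
        for offset, h in zip(range(-half, half + 1), window):
            j = pos + offset
            if 0 <= j < len(doc_terms):
                density[j] += weight * h
    return density


def passage_score(doc_terms, query_weights, width=5):
    """Score a document by the peak of its query-term density (assumed scoring rule)."""
    return max(density_distribution(doc_terms, query_weights, width), default=0.0)


if __name__ == "__main__":
    query = {"passage": 1.0, "retrieval": 1.0}
    clustered = "a b retrieval passage c d e f g h".split()
    scattered = "retrieval a b c d e f g h passage".split()
    print(passage_score(clustered, query))
    print(passage_score(scattered, query))
```

Under these assumptions, a document in which the query terms occur close together receives a higher score than one of the same length in which they are far apart. This locality is what a passage-based method can exploit when documents are long.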