Effective and Efficient Retrieval from Large and Dynamic Document Collections

A new retrieval method, together with a new access structure, is presented that aims at both high update efficiency and high retrieval effectiveness. The access structure consists of signatures and non-inverted descriptions. It can be updated efficiently because the description of a single document is stored in a compact form. The signatures are used first to compute approximate retrieval values, and the non-inverted descriptions are then used to determine the final list of documents ranked by their exact retrieval status values. Our basic approach, which is based on the standard tf*idf weighting scheme, has been improved in both retrieval effectiveness and retrieval efficiency. On average, the time for retrieving the top-ranked document is clearly below two seconds, while the document collection can be updated in 10 msec (inserting, deleting, or modifying a document description).
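
The two-stage scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name Collection, the 64-bit signature width, the CRC32-based term hashing, the bit-counting approximate score, and the idf variant log(1 + N/df) are all assumptions chosen for illustration. The sketch only shows the general idea: each document has one compact, non-inverted description plus a signature, so an update touches a single record, and a query first ranks documents by a cheap signature-based approximation before computing exact tf*idf values for a shortlist.

```python
import math
import zlib
from collections import Counter

SIG_BITS = 64  # hypothetical signature width, chosen only for this sketch


def term_bit(term):
    """Map a term to one bit of the signature using a deterministic hash."""
    return 1 << (zlib.crc32(term.encode("utf-8")) % SIG_BITS)


class Collection:
    """Toy access structure: per-document signature plus non-inverted description."""

    def __init__(self):
        self.descriptions = {}     # doc_id -> Counter of term frequencies
        self.signatures = {}       # doc_id -> integer bitmask
        self.doc_freq = Counter()  # term -> number of documents containing it

    def insert(self, doc_id, text):
        """Insert a document; an existing doc_id is replaced (modification)."""
        if doc_id in self.descriptions:
            self.delete(doc_id)
        tf = Counter(text.lower().split())
        sig = 0
        for term in tf:
            sig |= term_bit(term)
            self.doc_freq[term] += 1
        self.descriptions[doc_id] = tf
        self.signatures[doc_id] = sig

    def delete(self, doc_id):
        """Remove one document's description and signature."""
        tf = self.descriptions.pop(doc_id)
        del self.signatures[doc_id]
        for term in tf:
            self.doc_freq[term] -= 1
            if self.doc_freq[term] == 0:
                del self.doc_freq[term]

    def search(self, query, k=10, shortlist_size=100):
        """Return up to k (doc_id, score) pairs ranked by exact tf*idf."""
        terms = set(query.lower().split())
        bits = [term_bit(t) for t in terms]

        # Stage 1: approximate value = number of query-term bits set in the
        # signature. Hash collisions can cause false positives, never misses.
        approx = [
            (sum(1 for b in bits if sig & b), doc_id)
            for doc_id, sig in self.signatures.items()
        ]
        approx.sort(key=lambda pair: pair[0], reverse=True)
        shortlist = [doc_id for hits, doc_id in approx if hits > 0]
        shortlist = shortlist[:shortlist_size]

        # Stage 2: exact tf*idf from the non-inverted descriptions.
        n = len(self.descriptions)
        ranked = []
        for doc_id in shortlist:
            tf = self.descriptions[doc_id]
            score = sum(
                tf[t] * math.log(1 + n / self.doc_freq[t])
                for t in terms
                if t in tf
            )
            if score > 0:  # drops signature false positives
                ranked.append((doc_id, score))
        ranked.sort(key=lambda pair: pair[1], reverse=True)
        return ranked[:k]


if __name__ == "__main__":
    c = Collection()
    c.insert("d1", "signature files for text retrieval")
    c.insert("d2", "inverted files for text retrieval retrieval")
    c.insert("d3", "cooking pasta at home")
    for doc_id, score in c.search("retrieval signature"):
        print(doc_id, round(score, 3))
```

In this sketch, inserting, deleting, or modifying a document changes only that document's two entries and a handful of document-frequency counters, which is the property that makes updates cheap compared with rewriting posting lists in an inverted file. The price is that stage 1 scans every signature, so the signature test must be very cheap.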