Retrieval Models

Feature
A characteristic property of a document. Usually, a document's terms are used as features, but virtually any measurable document property can be chosen, such as word classes, average sentence length, principal components of term-document occurrence matrices, or term synonyms.

Information need
Specifically here: A lack of information or knowledge that can be satisfied by a set of text documents.

Query
Specifically here: A small set of terms that expresses a user's information need.

Relevance
The extent to which a document is capable of satisfying an information need. Within probabilistic retrieval models, relevance is modeled as a binary random variable.
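
To make the notions of feature and query concrete, the following is a minimal sketch, not a prescribed implementation. It assumes a simple regular-expression tokenizer, and the function names (extract_features, query_terms, term_overlap) are hypothetical names chosen for illustration. It derives two of the features mentioned above, term occurrences and average sentence length, and treats a query as a small set of terms.

```python
import re
from collections import Counter


def extract_features(text):
    """Return two illustrative features of a document:
    term counts and average sentence length (in tokens)."""
    # Assumption: sentences end at '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Assumption: terms are lowercase alphanumeric runs.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    avg_sentence_length = len(tokens) / len(sentences) if sentences else 0.0
    return {"terms": Counter(tokens), "avg_sentence_length": avg_sentence_length}


def query_terms(query):
    """A query, as defined here, is a small set of terms."""
    return set(re.findall(r"[a-z0-9]+", query.lower()))


def term_overlap(query, text):
    """Count how many query terms occur in the document.
    This is only a crude signal, not a relevance model."""
    doc_terms = extract_features(text)["terms"]
    return sum(1 for term in query_terms(query) if term in doc_terms)


doc = "Retrieval models rank documents. Documents contain terms."
features = extract_features(doc)
overlap = term_overlap("ranking documents terms", doc)
```

Note that the overlap count is only a feature-level signal. In a probabilistic retrieval model, relevance itself would be a binary random variable (relevant or not relevant) whose probability is estimated from such features.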