Fundamental Properties of the Core Matching Functions for Information Retrieval

Traditional benchmarking methods for information retrieval (IR) are based on experimental performance evaluation. Although the metrics precision and recall can measure the performance of a system, they cannot assess the functionality of the underlying model. Recently, a theory of “aboutness” has been studied and used for reasoning about the functionality of IR models. Recent research shows that the functionality of an IR model is largely determined by its retrieval mechanism, i.e., the matching function; in particular, containment and overlapping (either with or without a threshold value) are core IR matching functions. The objective of this paper is to model the containment and overlapping matching functions within an aboutness-based framework, and to reason about and analyze their inherent functionality from an abstract, theoretical viewpoint. Separate aboutness relations for containment, pure overlapping (i.e., without a threshold) and threshold overlapping are defined, and the sets of properties supported by each are derived and analyzed. These three relations can be used to explain the functionality supported by an IR system and its effect on the system's performance; moreover, they provide design guidelines for new IR systems.
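
As an informal illustration of the three matching functions named above, the following minimal sketch treats a document and a query as plain sets of terms. This set-of-terms representation, the function names, and the normalization of overlap by query size are assumptions made here for illustration only; the formal aboutness relations defined in the paper may differ.

```python
# Illustrative sketch only: documents and queries are modeled as sets of terms.
# The representation and the overlap normalization are assumptions, not the
# paper's formal definitions.

def containment(doc, query):
    """True if every query term also occurs in the document."""
    return set(query) <= set(doc)

def pure_overlap(doc, query):
    """True if the document and the query share at least one term."""
    return bool(set(doc) & set(query))

def threshold_overlap(doc, query, threshold):
    """True if the shared fraction of query terms reaches the threshold.

    Normalizing by the number of query terms is one possible choice (assumed).
    """
    q = set(query)
    if not q:
        return False
    return len(set(doc) & q) / len(q) >= threshold

doc = {"information", "retrieval", "model"}
full_query = {"retrieval", "model"}
partial_query = {"retrieval", "evaluation"}

print(containment(doc, full_query))                 # all query terms present
print(containment(doc, partial_query))              # "evaluation" is missing
print(pure_overlap(doc, partial_query))             # shares "retrieval"
print(threshold_overlap(doc, partial_query, 0.75))  # only half the terms match
```

Under these assumptions, containment is the strictest test, pure overlapping the most permissive, and threshold overlapping lies between the two depending on the chosen threshold.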