The Relevance of Recall and Precision in User Evaluation

The appropriateness of evaluation criteria and measures has been a subject of debate and a vital concern in the information retrieval evaluation literature. A study was conducted to investigate the appropriateness of 20 measures, representing four major evaluation criteria, for evaluating interactive information retrieval performance. Among the 20 measures studied were the two best‐known relevance‐based measures of effectiveness, recall and precision. The user's judgment of information retrieval success was used as the devised criterion measure with which the 20 other measures were to be correlated. A sample of 40 end‐users from an academic environment, each with an individual information problem, was observed interacting with six professional intermediaries who searched on their behalf in large operational systems. Quantitative data, consisting of values for all measures studied, and verbal data, containing users' reasons for assigning certain values to selected measures, were collected. Statistical analysis of the quantitative data showed that precision, one of the most important traditional measures of effectiveness, was not significantly correlated with the user's judgment of success. Users appeared to be more concerned with absolute recall than with precision, although absolute recall was not directly tested in the study. Four measures related to recall and precision were found to be significantly correlated with success. Among these were the user's satisfaction with completeness of search results and the user's satisfaction with precision of the search. This article explores possible explanations for this outcome through content analysis of users' verbal data. The analysis shows that high precision does not always mean high quality (relevancy, completeness, etc.) to users, because users' expectations differ. The user's purpose in obtaining information is suggested to be the primary cause of the high concern for recall.
Implications for research and practice are discussed. © 1994 John Wiley & Sons, Inc.
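
As a rough illustration of the kind of analysis summarized above, the following Python sketch computes the standard set-based definitions of precision and recall for a few searches and correlates each with a user's success rating. It is only a sketch under stated assumptions: the abstract does not say which correlation statistic or rating scale the study used, so the Pearson coefficient and the 7-point scale here are assumptions, and the three searches are invented toy data, not results from the study.

```python
from math import sqrt


def precision(retrieved, relevant):
    """Fraction of retrieved items that are relevant (0.0 if nothing retrieved)."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


def recall(retrieved, relevant):
    """Fraction of relevant items that were retrieved (0.0 if nothing is relevant)."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Hypothetical toy data: (retrieved items, relevant items, success rating).
# The 7-point rating scale is an assumption made for this example only.
searches = [
    ({"d1", "d2", "d3", "d4"}, {"d1", "d2", "d5"}, 5),
    ({"d6", "d7"}, {"d6", "d7", "d8", "d9"}, 3),
    ({"d10", "d11", "d12"}, {"d10", "d11", "d12"}, 7),
]

precisions = [precision(ret, rel) for ret, rel, _ in searches]
recalls = [recall(ret, rel) for ret, rel, _ in searches]
success = [rating for _, _, rating in searches]

print("precision vs. success:", pearson(precisions, success))
print("recall vs. success:   ", pearson(recalls, success))
```

With real data, each measure's coefficient would also need a significance test before any claim like those in the abstract could be made; that step is omitted here.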