Paper Information - Automatic and Semi-Automatic Document Selection for Technology-Assisted Review

Automatic and Semi-Automatic Document Selection for Technology-Assisted Review

Abstract: In the TREC Total Recall Track (2015-2016), participating teams could employ either fully automatic or human-assisted ("semi-automatic") methods to select documents for relevance assessment by a simulated human reviewer. According to the TREC 2016 evaluation, the fully automatic baseline method achieved a recall-precision breakeven ("R-precision") score of 0.71, while the two semi-automatic efforts achieved scores of 0.67 and 0.51. In this work, we investigate the extent to which the observed effectiveness of the different methods may be confounded by chance, by inconsistent adherence to the Track guidelines, by selection bias in the evaluation method, or by discordant relevance assessments. We find no evidence that any of these factors could yield relative effectiveness scores inconsistent with the official TREC 2016 ranking.

Maura R. Grossman | Gordon V. Cormack | Adam Roegiest

[1] Ellen M. Voorhees, et al. Variations in relevance judgments and the measurement of retrieval effectiveness, 1998, SIGIR '98.

[2] Maura R. Grossman, et al. TREC 2016 Total Recall Track Overview, 2016, TREC.

[3] D. Horvitz, et al. A Generalization of Sampling Without Replacement from a Finite Universe, 1952.

[4] Jeremy Pickens, et al. An Exploration of Total Recall with Multiple Manual Seedings, 2016, TREC.

[5] Mark Sanderson, et al. Forming test collections with no system pooling, 2004, SIGIR '04.

[6] Tefko Saracevic, et al. Why Is Relevance Still the Basic Notion in Information Science? - (Despite Great Advances in Information Technology), 2015, ISI.

[7] Jim Sullivan, et al. e-Discovery Team at TREC 2015 Total Recall Track, 2015, TREC.