Autonomy and Reliability of Continuous Active Learning for Technology-Assisted Review

We enhance the autonomy of the continuous active learning method shown by Cormack and Grossman (SIGIR 2014) to be effective for technology-assisted review, in which documents from a collection are retrieved and reviewed, using relevance feedback, until substantially all of the relevant documents have been reviewed. Autonomy is enhanced through the elimination of topic-specific and dataset-specific tuning parameters, so that the sole input required by the user is, at the outset, a short query, topic description, or single relevant document, and, throughout the review, ongoing relevance assessments of the retrieved documents. We show that our enhancements consistently yield results superior to those of Cormack and Grossman's version of continuous active learning and other methods, not only on average, but on the vast majority of topics from four separate sets of tasks: the legal datasets examined by Cormack and Grossman, the Reuters RCV1-v2 subject categories, the TREC 6 AdHoc task, and the construction of the TREC 2002 filtering test collection.
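For concreteness, the sketch below illustrates a continuous active learning loop of the kind described above: a seed query (or a single relevant document) bootstraps a classifier, the highest-scoring unreviewed documents are presented for judgment, and each judgment immediately feeds the next training round. This is a minimal illustration in Python, assuming a scikit-learn logistic-regression classifier over TF-IDF features; the hypothetical review(doc_id) callback, the batch-growth schedule, the provisional-negative sampling, and the fixed review budget are illustrative assumptions, not the authors' exact baseline implementation.

```python
# Minimal sketch of a CAL-style relevance-feedback loop (not the authors' exact
# implementation). Assumes: `collection` is a list of document strings,
# `seed_query` is a short query or topic description, and `review(doc_id)` is a
# hypothetical callback returning a human relevance judgment (True/False).

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def continuous_active_learning(collection, seed_query, review, max_reviews=1000):
    # Treat the seed query as a synthetic relevant "document" appended to the collection.
    docs = list(collection) + [seed_query]
    vectorizer = TfidfVectorizer(sublinear_tf=True)
    X = vectorizer.fit_transform(docs)
    seed_idx = len(docs) - 1

    labels = {seed_idx: True}   # document index -> relevance judgment
    reviewed = set()            # collection documents already shown to the reviewer
    batch = 1
    rng = np.random.default_rng(0)

    while len(reviewed) < min(max_reviews, len(collection)):
        train_idx = list(labels)
        y = [labels[i] for i in train_idx]
        if len(set(y)) < 2:
            # Only one class judged so far: add provisional negatives, i.e. a random
            # sample of unjudged documents presumed non-relevant for this round.
            pool = [i for i in range(len(collection)) if i not in labels]
            presumed = rng.choice(pool, size=min(100, len(pool)), replace=False)
            train_idx = train_idx + list(presumed)
            y = y + [False] * len(presumed)

        model = LogisticRegression(max_iter=1000)
        model.fit(X[train_idx], y)

        # Score the unreviewed documents and send the top of the ranking for review.
        candidates = [i for i in range(len(collection)) if i not in reviewed]
        scores = model.predict_proba(X[candidates])[:, 1]
        top = [candidates[i] for i in np.argsort(-scores)[:batch]]

        for doc_id in top:
            labels[doc_id] = review(doc_id)   # human judgment feeds the next round
            reviewed.add(doc_id)

        batch += -(-batch // 10)   # grow the review batch by roughly 10% per round

    return [i for i in reviewed if labels[i]]
```

The only per-topic inputs here are the seed query and the stream of relevance judgments, mirroring the paper's claim that no topic-specific or dataset-specific tuning is required; the fixed review budget merely stands in for whatever stopping criterion a real review would apply.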

[1] Philip M. Long et al. Practical learning from one-sided feedback. KDD '07, 2007.

[2] Mark Sanderson et al. Forming test collections with no system pooling. SIGIR '04, 2004.

[3] David C. Gibbon et al. Support vector machines: relevance feedback and information retrieval. Inf. Process. Manag., 2002.

[4] Justin Zobel et al. How reliable are the results of large-scale information retrieval experiments? SIGIR '98, 1998.

[5] Gordon V. Cormack et al. Machine Learning for Information Retrieval: TREC 2009 Web, Relevance Feedback and Legal Tracks. TREC, 2009.

[6] Maura R. Grossman et al. Evaluation of machine-learning protocols for technology-assisted review in electronic discovery. SIGIR, 2014.

[7] Burr Settles et al. Active Learning Literature Survey. 2009.

[8] Douglas W. Oard et al. Overview of the TREC 2009 Legal Track. TREC, 2009.

[9] M. E. Maron et al. An evaluation of retrieval effectiveness for a full-text document-retrieval system. CACM, 1985.

[10] Douglas W. Oard et al. Overview of the TREC 2008 Legal Track. TREC, 2008.

[11] Michele Tarsilla. Cochrane Handbook for Systematic Reviews of Interventions. Journal of MultiDisciplinary Evaluation, 2010.

[12] Charles L. A. Clarke et al. Efficient construction of large test collections. SIGIR '98, 1998.

[13] J. Glanville et al. Searching for Studies. 2008.

[14] Yiming Yang et al. RCV1: A New Benchmark Collection for Text Categorization Research. J. Mach. Learn. Res., 2004.

[15] William A. Gale et al. A sequential algorithm for training text classifiers. SIGIR '94, 1994.

[16] Douglas W. Oard et al. Overview of the TREC 2011 Legal Track. TREC, 2011.

[17] Christopher Hogan et al. H5 at TREC 2008 Legal Interactive: User Modeling, Assessment & Measurement. TREC, 2008.

[18] Ellen M. Voorhees et al. Variations in relevance judgments and the measurement of retrieval effectiveness. SIGIR '98, 1998.

[19] Stephen E. Robertson et al. Building a filtering test collection for TREC 2002. SIGIR, 2003.

[20] Douglas W. Oard et al. Overview of the TREC 2010 Legal Track. TREC, 2010.

[21] Fabrizio Sebastiani et al. Machine learning in automated text categorization. CSUR, 2001.

[22] David C. Blair. STAIRS Redux: Thoughts on the STAIRS Evaluation, Ten Years after. J. Am. Soc. Inf. Sci., 1996.