A Blueprint of IR Evaluation Integrating Task and User Characteristics: Test Collection and Evaluation Metrics

Relevance is generally understood as a multi-level and multidimensional relationship between an information need and an information object. However, traditional IR evaluation metrics naively assume mono-dimensionality. We ask: How should multidimensional and graded relevance assessments be handled in IR evaluation? Moreover, search result evaluation metrics neglect document overlap and naively assume that gains keep accumulating as the searcher examines the ranked list to greater depth. Consequently, we examine: How should document overlap be handled in IR evaluation? The usability of a document for a person in need also depends on document attributes beyond relevance. Therefore, we ask: How should usability attributes be handled, and how can they be combined with multidimensional relevance assessments in IR evaluation? Finally, we ask how to define a formal model that handles multidimensional graded relevance assessments, document overlap, and document usability attributes in a coherent framework for IR evaluation.
