Using the Characteristics of Documents, Users and Tasks to Predict the Situational Relevance of Health Web Documents

Relevance is usually estimated by search engines using document content, disregarding the user behind the search and the characteristics of the task. In this work, we look at relevance as framed in a situational context, calling it situational relevance, and analyze whether it can be predicted from document, user and task characteristics. Using an existing dataset composed of health web documents, relevance judgments for information needs, and user and task characteristics, we build a multivariate prediction model for situational relevance. Our model has an accuracy of 77.17%. Our findings provide insights into features that could improve the estimation of relevance by search engines, helping to reconcile the systemic and situational views of relevance. In the near future, we will work on the automatic assessment of document, user and task characteristics.
