The influence of document characteristics on the quality of health web documents

The quality of consumer-oriented health information on the Web is usually assessed through the medical certification of websites. Certification tools are built upon quality indicators but, so far, no standard set of indicators has been defined. The objective of the present study is to explore how common specific document features are and how they relate to the quality of health web documents, using HONcode certification as ground truth. A set of top-ranked health documents retrieved from a major search engine was characterized in a univariate analysis and then used in a bivariate analysis to identify features associated with document quality. The univariate analysis provides insights into the characteristics of the overall population of health web documents. The bivariate analysis reveals strong relations between document quality and a set of features (namely split content, videos, images, advertisements, and English language) that are potential quality indicators. These features can be used to assess whether the information in such documents is trustworthy. The main contribution of this work is to provide additional features as candidate indicators of quality. Non-health professionals can use these indicators in automatic and manual assessments of health content.
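As an illustration of the kind of bivariate analysis described above, the following minimal sketch tests whether a binary document feature is associated with certification. The abstract does not state which statistical test the study used; the Pearson chi-square test, the field names (has_ads, certified), and the toy counts are assumptions chosen for illustration and are not data from the study.

```python
from collections import Counter

def contingency_table(documents, feature):
    """Count documents by (feature present, certified) into a 2x2 table.

    Rows: feature present / absent. Columns: certified / not certified.
    """
    counts = Counter(
        (bool(doc[feature]), bool(doc["certified"])) for doc in documents
    )
    return [
        [counts[(True, True)], counts[(True, False)]],
        [counts[(False, True)], counts[(False, False)]],
    ]

def chi_square(table):
    """Pearson chi-square statistic for a 2x2 table, without continuity correction.

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            stat += (table[i][j] - expected) ** 2 / expected
    return stat

# Hypothetical toy sample: 100 documents, invented for illustration only.
documents = (
    [{"has_ads": True, "certified": True}] * 10
    + [{"has_ads": True, "certified": False}] * 30
    + [{"has_ads": False, "certified": True}] * 40
    + [{"has_ads": False, "certified": False}] * 20
)

table = contingency_table(documents, "has_ads")
stat = chi_square(table)

# With 1 degree of freedom, the 5% critical value is about 3.841.
associated = stat > 3.841
print(table, round(stat, 2), associated)
```

A statistic above the critical value only indicates an association between the feature and certification in the sample; it says nothing about direction or causation, so a candidate indicator found this way would still need to be interpreted against the contingency table itself.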
