Understanding and Mitigating Bias in Online Health Search

Search engines are perceived as a reliable source for general information needs. However, finding the answer to a medical question using a search engine can be challenging for an ordinary user: content can be biased, results may present differing opinions, and medical content can be difficult to interpret for users with no medical background. Each of these factors can lead users to incorrect conclusions about health-related questions. In this work we address the problem from two perspectives. First, to gain insight into users' ability to correctly answer medical questions using search engines, we conduct a comprehensive user study. We show that, for questions regarding the effectiveness of medical treatments, participants struggle to find the correct answer and are prone to overestimating treatment effectiveness. We analyze participants by age and education level and show that this problem persists across all demographic groups. Second, we propose a semi-automatic machine learning approach for finding the correct answer to queries on medical treatment effectiveness, as viewed by the medical community. The model relies on the opinions presented in medical papers related to the queries, as well as on features representing the impact of those papers. We show that, compared to human behaviour, our method is less prone to bias. We compare various configurations of our inference model against a baseline method that determines treatment effectiveness based solely on the opinions of medical papers. The results bolster our confidence that our approach can pave the way to developing automatic, bias-free tools that help mediate complex health-related content to users.
