Predicting Consumer Familiarity with Health Topics by Query Formulation and Search Result Interaction

Searching for understandable health information on the Internet remains difficult for most consumers. Consumers differ in their familiarity with health topics, and this diversity can lead to misunderstanding when the information returned by a health search does not match the consumer’s level of understanding. This study aimed to develop models that predict health topic familiarity from consumers’ search behavior, namely how they formulate queries and how they interact with the search results. The experimental results show that Naive Bayes and Sequential Minimal Optimization classifiers achieved high accuracy in predicting consumers’ health topic familiarity when using the combined query formulation and search result interaction feature sets. This finding suggests that identifying health topic familiarity from query formulation and search result interaction is feasible and effective.
