Research Paper: Estimating Consumer Familiarity with Health Terminology: A Context-based Approach

OBJECTIVES Effective health communication is often hindered by a "vocabulary gap" between the language familiar to consumers and the jargon used in medical practice and research. To present health information to consumers in a comprehensible fashion, we need a mechanism that estimates how likely a health term is to be understood by typical members of the lay public. Prior research has relied on syllable counts, easy-word lists, and frequency counts, all of which have significant limitations.

DESIGN In this article, we present a new method that predicts consumer familiarity using contextual information. The method was applied to a large query log data set and validated against the results of two previously conducted consumer surveys.

MEASUREMENTS We measured the correlation between the survey results and each of four predictors: the context-based prediction, syllable count, frequency count, and log normalized frequency count.

RESULTS The correlation coefficient between the context-based prediction and the survey results was 0.773 (p < 0.001), which was higher than the correlation coefficients between the survey results and the syllable count, frequency count, and log normalized frequency count (p ≤ 0.012).

CONCLUSIONS The context-based approach provides a good alternative to the existing term familiarity assessment methods.
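The measurement step described above, correlating survey results with each candidate predictor, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the log normalization or name the correlation statistic, so the sketch assumes a Pearson correlation and a log(1 + count) transform scaled by its maximum. The function names and the toy values are hypothetical and are not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def log_normalized(counts):
    """Assumed normalization: log(1 + count) divided by the largest log value."""
    logs = [math.log1p(c) for c in counts]
    top = max(logs)
    if top == 0:
        raise ValueError("all counts are zero")
    return [v / top for v in logs]


# Hypothetical toy values, invented for illustration only.
survey_familiarity = [0.95, 0.80, 0.40, 0.15]  # share of respondents who knew the term
syllable_counts = [2, 3, 5, 6]
frequency_counts = [12000, 3000, 150, 20]

predictors = {
    "syllable count": syllable_counts,
    "frequency count": frequency_counts,
    "log normalized frequency count": log_normalized(frequency_counts),
}

for name, values in predictors.items():
    print(f"{name}: r = {pearson(survey_familiarity, values):.3f}")
```

A context-based prediction would be added to `predictors` in the same way, as one more list of per-term scores aligned with the survey values.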
