Measuring the value of health query translation: An analysis by user language proficiency

English is by far the most used language on the web. In some domains, having less content in a user's native language may not be problematic and may even help the user cope with information overload. Yet in domains such as health, where information quality is critical, a larger quantity of information may mean easier access to higher‐quality content. Query translation may be a good strategy for accessing content in other languages, but the presence of medical terms in health queries makes translation more difficult, even for users with very good language proficiency. In this study, we evaluate how translating a health query affects users with different levels of language proficiency. We chose English as the non‐native language because it is widely spoken and is the most used language on the web. Our findings suggest that non‐English–speaking users with at least elementary English proficiency can benefit from a system that suggests English alternatives for their queries or automatically retrieves English content from a non‐English query. Taking the user's language proficiency into account in this way results in higher precision, more accurate medical knowledge, and better access to high‐quality content. Moreover, suggestions of English‐translated queries may also trigger new health search strategies.
