Search and Breast Cancer: On Disruptive Shifts of Attention over Life Histories of an Illness

We seek to understand the evolving needs of people faced with a life-changing medical diagnosis by analyzing queries extracted from an anonymized search query log. Focusing on breast cancer, we manually tag a set of Web searchers who show disruptive shifts in focus of attention and long-term patterns of search behavior consistent with the diagnosis and treatment of breast cancer. We build and apply probabilistic classifiers that detect these searchers across multiple sessions and estimate the timing of diagnosis, using a variety of temporal and statistical features. We explore how information seeking changes before and after an inferred diagnosis of breast cancer by aligning multiple searchers on the likely time of diagnosis. We automatically identify 1700 candidate searchers with an estimated 90% precision, and we predict the day of diagnosis within 15 days with 88% accuracy. We show that the geographic and demographic attributes of searchers identified with high probability are strongly correlated with ground truth of reported incidence rates. We then analyze the content of queries over time from searchers for whom a diagnosis was predicted, using a detailed ontology of cancer-related search terms. Our analysis reveals the rich temporal structure of the evolving queries of people likely diagnosed with breast cancer. Finally, we focus on subtypes of illness based on inferred stages of cancer and show clinically relevant dynamics of information seeking based on the dominant stage expressed by searchers.
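The alignment step and the "within 15 days" accuracy figure can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the tuple layout of the query records, and the sample queries and dates are all assumptions made for illustration. It only shows how queries from different searchers might be re-indexed by days relative to an inferred diagnosis date, and how a tolerance-based accuracy could be computed against labeled dates.

```python
from collections import defaultdict
from datetime import date


def align_by_diagnosis(queries, diagnosis_day):
    """Group query texts by day offset from each searcher's inferred diagnosis.

    queries: iterable of (searcher_id, query_date, query_text) tuples
             (a hypothetical record layout, not taken from the paper).
    diagnosis_day: dict mapping searcher_id -> inferred diagnosis date.
    Returns a dict mapping offset in days (negative = before diagnosis)
    to the list of query texts issued at that offset.
    """
    aligned = defaultdict(list)
    for searcher_id, query_date, text in queries:
        if searcher_id not in diagnosis_day:
            continue  # skip searchers with no inferred diagnosis
        offset = (query_date - diagnosis_day[searcher_id]).days
        aligned[offset].append(text)
    return dict(aligned)


def accuracy_within(predicted, actual, tolerance_days=15):
    """Fraction of searchers whose predicted date is within the tolerance."""
    shared = predicted.keys() & actual.keys()
    if not shared:
        return 0.0
    hits = sum(
        abs((predicted[s] - actual[s]).days) <= tolerance_days for s in shared
    )
    return hits / len(shared)


# Made-up example data for two hypothetical searchers.
queries = [
    ("a", date(2013, 3, 1), "breast lump"),
    ("a", date(2013, 3, 10), "lumpectomy recovery"),
    ("b", date(2013, 6, 5), "mammogram results"),
    ("b", date(2013, 6, 12), "chemotherapy side effects"),
]
inferred = {"a": date(2013, 3, 3), "b": date(2013, 6, 7)}
labeled = {"a": date(2013, 3, 10), "b": date(2013, 7, 1)}

aligned = align_by_diagnosis(queries, inferred)
score = accuracy_within(inferred, labeled, tolerance_days=15)
```

Keying the result by day offset rather than calendar date lets queries from searchers diagnosed months apart be pooled and compared at the same point in the course of the illness.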
