Internet Search Patterns of Human Immunodeficiency Virus and the Digital Divide in the Russian Federation: Infoveillance Study

Background: Human immunodeficiency virus (HIV) is a serious health problem in the Russian Federation. However, the true scale of HIV in Russia has long been the subject of considerable debate. Using digital surveillance to monitor diseases has become increasingly popular in high-income countries. Yet Internet users may not be representative of the overall population, and the characteristics of the Internet-using population cannot be directly ascertained from search pattern data. This exploratory infoveillance study examined whether Internet search patterns can be used for disease surveillance in a large middle-income country with a dispersed population.

Objective: This study had two main objectives: (1) to validate Internet search patterns against national HIV prevalence data, and (2) to investigate the relationship between search patterns and the determinants of Internet access.

Methods: We first assessed whether online surveillance is a valid and reliable method for monitoring HIV in the Russian Federation. Yandex and Google both provided tools to study search patterns in the Russian Federation. We evaluated the relationship between aggregated search patterns from both Yandex and Google and HIV prevalence in 2011 at the national and regional levels. Second, we analyzed the determinants of Internet access to determine the extent to which they explained regional variations in searches for the Russian terms for “HIV” and “AIDS”. We sought to extend understanding of the characteristics of Internet-searching populations by matching data on the determinants of Internet access (age, education, income, broadband access price, and urbanization ratios) with searches for the term “HIV” using principal component analysis (PCA).

Results: We found generally strong correlations between HIV prevalence and searches for the terms “HIV” and “AIDS”. At the national level, Yandex searches for “HIV” were very strongly correlated with HIV prevalence (Spearman rank-order coefficient [rs]=.881, P≤.001), and searches for “AIDS” were strongly correlated (rs=.714, P≤.001). The strength of correlations varied across Russian regions. National correlations for Google searches for “HIV” (rs=.672, P=.004) and “AIDS” (rs=.584, P≤.001) were weaker than those for Yandex. Second, we examined the relationship between the determinants of Internet access and search patterns for the term “HIV” across Russia using PCA. At the national level, Principal Component 1, with its largest loadings on age (-0.56), HIV search (-0.533), and education (-0.479), accounted for 32% of the variance. Principal Component 2 (income, -0.652; broadband price, -0.460) accounted for 22% of the national variance.

Conclusions: This study contributes to the methodological literature on search patterns in public health. Based on our preliminary research, we suggest that PCA may be used to evaluate the relationship between the determinants of Internet access and searches for health problems beyond high-income countries. We believe it is in middle-income countries that search methods can make the greatest contribution to public health.
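The two analyses described in the abstract, a Spearman rank-order correlation between search volume and HIV prevalence and a PCA over the determinants of Internet access, can be illustrated with a minimal sketch. This is not the authors' code: the data below are synthetic, and the function names (`rank_average`, `spearman`, `pca`), the number of regions, and the choice to standardize columns before PCA are assumptions made only for illustration.

```python
import numpy as np


def rank_average(values):
    """Rank values from 1..n, giving tied values their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks


def spearman(x, y):
    """Spearman r_s computed as the Pearson correlation of the ranks."""
    return float(np.corrcoef(rank_average(x), rank_average(y))[0, 1])


def pca(data):
    """PCA on standardized columns (an assumption, since the variables
    have different units). Returns (loadings, explained_variance_ratio);
    column k of loadings holds the loadings of component k+1."""
    data = np.asarray(data, dtype=float)
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return vt.T, explained


# Synthetic stand-in for regional data (hypothetical values, not the study's).
rng = np.random.default_rng(0)
n_regions = 80
prevalence = rng.gamma(shape=2.0, scale=150.0, size=n_regions)
hiv_search = prevalence * 0.01 + rng.normal(0.0, 1.0, size=n_regions)

# Columns: age, education, income, broadband price, urbanization, HIV search.
determinants = rng.normal(size=(n_regions, 5))
table = np.column_stack([determinants, hiv_search])

r_s = spearman(hiv_search, prevalence)
loadings, explained = pca(table)
print(f"Spearman r_s (synthetic): {r_s:.3f}")
print("Share of variance for PC1 and PC2:", np.round(explained[:2], 3))
```

On real data, the signs and sizes of the loadings in the first two columns of `loadings` would be read the same way as the Principal Component 1 and 2 values reported in the Results. The sign of each component is arbitrary, so only the relative pattern of loadings is meaningful.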
