Big Data Science: Opportunities and Challenges to Address Minority Health and Health Disparities in the 21st Century.

Addressing minority health and health disparities has been a missing piece of the puzzle in Big Data science. This article focuses on three priority opportunities that Big Data science may offer for reducing health and health care disparities. The first is to incorporate standardized information on demographic and social determinants into electronic health records in order to target ways to improve quality of care for the most disadvantaged populations over time. The second is to enhance public health surveillance by linking geographical variables and social determinants of health for geographically defined populations to clinical data and health outcomes. Third and most importantly, Big Data science may lead to a better understanding of minority health and of the etiology of health disparities, which can guide intervention development. However, the promise of Big Data needs to be considered in light of significant challenges that threaten to widen health disparities. Care must be taken to incorporate diverse populations to realize the potential benefits. Specific recommendations include investing in data collection on small sample populations, building a diverse workforce pipeline for data science, actively seeking to reduce digital divides, developing novel ways to assure digital data privacy for small populations, and promoting widespread data sharing to benefit under-resourced minority-serving institutions and minority researchers. With deliberate efforts, Big Data presents a dramatic opportunity for reducing health disparities, but without active engagement, it risks further widening them.
