Toward a Literature-Driven Definition of Big Data in Healthcare

Objective. The aim of this study was to provide a definition of big data in healthcare. Methods. A systematic search of the PubMed literature published until May 9, 2014, was conducted. For every paper describing a dataset, we recorded the number of statistical individuals (n) and the number of variables (p). These papers were classified by field of study. Characteristics that authors attributed to big data were also considered. Based on this analysis, a definition of big data was proposed. Results. A total of 196 papers were included. Big data can be defined as datasets with log(n × p) ≥ 7. Its properties are great variety and high velocity. Big data raises challenges concerning veracity, all aspects of the workflow, the extraction of meaningful information, and the sharing of information. It requires new computational methods that optimize data management. Related concepts are data reuse, false knowledge discovery, and privacy issues. Conclusion. Big data is defined by volume. It should not be confused with data reuse: data can be big without being reused for another purpose, for example, in omics. Conversely, data can be reused without necessarily being big, for example, in the secondary use of Electronic Medical Records (EMR) data.
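As an illustration of the volume criterion stated in the Results, the following minimal Python sketch checks whether a dataset with n statistical individuals and p variables meets the threshold. It assumes the logarithm is base 10, which the abstract does not state explicitly; the function names `log_volume` and `is_big_data` are hypothetical and are not taken from the paper.

```python
import math


def log_volume(n: int, p: int) -> float:
    """Return log10(n * p) for n statistical individuals and p variables.

    Assumption: base-10 logarithm (the abstract does not specify the base).
    """
    if n <= 0 or p <= 0:
        raise ValueError("n and p must be positive integers")
    return math.log10(n * p)


def is_big_data(n: int, p: int, threshold: float = 7.0) -> bool:
    """Apply the volume criterion log(n * p) >= threshold (7 by default)."""
    return log_volume(n, p) >= threshold


if __name__ == "__main__":
    # Hypothetical dataset sizes, chosen only to illustrate the criterion.
    examples = {
        "small clinical study": (500, 20),
        "large registry": (1_000_000, 100),
    }
    for name, (n, p) in examples.items():
        print(f"{name}: log10(n*p) = {log_volume(n, p):.2f}, big = {is_big_data(n, p)}")
```

Under this reading, a dataset qualifies when the product of individuals and variables reaches roughly ten million cells, regardless of whether that size comes from many individuals, many variables, or both.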
