Big Data and Biomedical Informatics: A Challenging Opportunity

Big data are receiving increasing attention in biomedicine and healthcare, and it is important to understand why they are assuming a crucial role for the biomedical informatics community. The capability of handling big data is becoming an enabler for unprecedented research studies and for new models of healthcare delivery. It is therefore first necessary to understand the four elements that constitute big data, namely Volume, Variety, Velocity, and Veracity, and what they mean in practice. It is then necessary to understand where big data are present and where they can be beneficially collected. Some research fields, such as translational bioinformatics, must rely on big data technologies to withstand the shock wave of data generated every day. Other areas, ranging from epidemiology to clinical care, can benefit from exploiting the large amounts of data now available, from personal monitoring to primary care. However, building big data-enabled systems carries relevant implications for the reproducibility of research studies and for the management of privacy and data access; proper actions should be taken to deal with these issues. An interesting consequence of the big data scenario is the availability of new software, methods, and tools, such as map-reduce, cloud computing, and machine learning algorithms that handle concept drift, which will not only contribute to big data research but may also be beneficial in many biomedical informatics applications. The way forward with the big data opportunity will require properly applied engineering principles to design studies and applications, to avoid preconceptions and over-enthusiasm, to fully exploit the available technologies, and to improve data processing and data management regulations.
