Leveraging big data analytics in healthcare enhancement: trends, challenges and opportunities

Clinicians' decisions are becoming increasingly evidence-based, and in no other field is big data analytics as promising as in healthcare. Owing to the sheer size and availability of healthcare data, big data analytics has revolutionized this industry and opened up a world of opportunities. It promises early detection, prediction and prevention of disease, and helps to improve quality of life. Researchers and clinicians are working to ensure that big data has a positive impact on health in the future. Different tools and techniques are being used to analyze, process, accumulate, assimilate and manage large amounts of healthcare data in both structured and unstructured form. In this paper, we address the need for big data analytics in healthcare: why and how can it help to improve life? We present the emerging landscape of big data and analytical techniques in five sub-disciplines of healthcare, i.e., medical image analysis and imaging informatics, bioinformatics, clinical informatics, public health informatics and medical signal analytics. We present the different architectures, advantages and repositories of each discipline, which together give an integrated picture of how distinct healthcare activities are accomplished in the pipeline to serve individual patients from multiple perspectives. Finally, the paper concludes with notable applications and challenges in the adoption of big data analytics in healthcare.


[236]  Rinkle Rani,et al.  Managing Data in Healthcare Information Systems: Many Models, One Solution , 2015, Computer.

[237]  Robert P. W. Duin,et al.  Sammon's mapping using neural networks: A comparison , 1997, Pattern Recognit. Lett..

[238]  Henning Hermjakob,et al.  Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework , 2012, BMC Bioinformatics.

[239]  B. Jeong,et al.  Activities on Facebook Reveal the Depressive State of Users , 2013, Journal of medical Internet research.

[240]  Chad M. Miller,et al.  Consensus summary statement of the International Multidisciplinary Consensus Conference on Multimodality Monitoring in Neurocritical Care , 2014, Intensive Care Medicine.

[241]  Hiroyuki Ohsaki,et al.  On estimating depressive tendencies of Twitter users utilizing their tweet data , 2013, 2013 IEEE Virtual Reality (VR).

[242]  Ali S. Hadi,et al.  Finding Groups in Data: An Introduction to Chster Analysis , 1991 .

[243]  V. Neelima,et al.  A Review of Data Mining using Bigdata in Health Informatics , 2015 .

[244]  Tahani Daghistani,et al.  Discovering Diabetes Complications: an Ontology Based Model , 2015, Acta informatica medica : AIM : journal of the Society for Medical Informatics of Bosnia & Herzegovina : casopis Drustva za medicinsku informatiku BiH.

[245]  R. Manimegalai,et al.  Medical Image Retrieval System in Grid Using Hadoop Framework , 2014, 2014 International Conference on Computational Science and Computational Intelligence.

[246]  N Peek,et al.  Technical Challenges for Big Data in Biomedicine and Health: Data Sources, Infrastructure, and Analytics , 2014, Yearbook of Medical Informatics.

[247]  John W. Sammon,et al.  A Nonlinear Mapping for Data Structure Analysis , 1969, IEEE Transactions on Computers.

[248]  Veda C. Storey,et al.  Business Intelligence and Analytics: From Big Data to Big Impact , 2012, MIS Q..

[249]  Sandeep Tata,et al.  BlueSNP: R package for highly scalable genome-wide association studies using Hadoop clusters , 2013, Bioinform..

[250]  Evelyn Camon,et al.  The EMBL Nucleotide Sequence Database , 2000, Nucleic Acids Res..

[251]  Chenguang He,et al.  Toward Ubiquitous Healthcare Services With a Novel Efficient Cloud Platform , 2013, IEEE Transactions on Biomedical Engineering.

[252]  Muhammad Imran Razzak,et al.  Malarial Parasite Classification using Recurrent Neural Network , 2015 .

[253]  Robert B. Hartlage,et al.  This PDF file includes: Materials and Methods , 2009 .

[254]  International Human Genome Sequencing Consortium Initial sequencing and analysis of the human genome , 2001, Nature.

[255]  Hong Zheng,et al.  Massive Medical Images Retrieval System Based on Hadoop , 2014, J. Multim..

[256]  Lei Xing,et al.  Ultrafast and scalable cone-beam CT reconstruction using MapReduce in a cloud computing environment. , 2011, Medical physics.

[257]  Futao Zhang,et al.  FastGCN: A GPU Accelerated Tool for Fast Gene Co-Expression Networks , 2015, PloS one.

[258]  Lars Johansson,et al.  Automated comparison of last hospital main diagnosis and underlying cause of death ICD10 codes, France, 2008–2009 , 2014, BMC Medical Informatics and Decision Making.

[259]  Steve Horvath,et al.  WGCNA: an R package for weighted correlation network analysis , 2008, BMC Bioinformatics.

[260]  Joshua Zhexue Huang,et al.  Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values , 1998, Data Mining and Knowledge Discovery.

[261]  Hans-Jürgen Bandelt,et al.  mtDNA data mining in GenBank needs surveying. , 2009, American journal of human genetics.

[262]  Jong-Hyun Park,et al.  Identifying and prioritizing critical factors for promoting the implementation and usage of big data in healthcare , 2017 .

[263]  Muhammad Imran Razzak,et al.  Deep Learning for Medical Image Processing: Overview, Challenges and Future , 2017, ArXiv.

[264]  Konstantinos Krampis,et al.  Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community , 2012, BMC Bioinformatics.

[265]  Hairong Kuang,et al.  The Hadoop Distributed File System , 2010, 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST).

[266]  Yasser El-Sonbaty,et al.  MedCloud: Healthcare cloud computing system , 2012, 2012 International Conference for Internet Technology and Secured Transactions.

[267]  Hadi Kharrazi,et al.  A public health perspective on using electronic health records to address social determinants of health: The potential for a national system of local community health records in the United States , 2019, Int. J. Medical Informatics.

[268]  David E. Goldberg,et al.  A niched Pareto genetic algorithm for multiobjective optimization , 1994, Proceedings of the First IEEE Conference on Evolutionary Computation. IEEE World Congress on Computational Intelligence.

[269]  B. Kuster,et al.  Mass-spectrometry-based draft of the human proteome , 2014, Nature.

[270]  Michael C. Schatz,et al.  CloudBurst: highly sensitive read mapping with MapReduce , 2009, Bioinform..

[271]  Ben Langmead,et al.  Genotyping in the Cloud with Crossbow , 2012, Current protocols in bioinformatics.

[272]  Taghi M. Khoshgoftaar,et al.  A review of data mining using big data in health informatics , 2013, Journal Of Big Data.

[273]  Lăcrămioara Stoicu-Tivadar,et al.  Supporting diagnosis and treatment in medical care based on Big Data processing. , 2014, Studies in health technology and informatics.

[274]  Marie-Claude Blatter,et al.  Protein variety and functional diversity: Swiss-Prot annotation in its biological context. , 2005, Comptes rendus biologies.

[275]  Jeffrey M. Hausdorff,et al.  Physionet: Components of a New Research Resource for Complex Physiologic Signals". Circu-lation Vol , 2000 .

[276]  Jean Yee Hwa Yang,et al.  Direction pathway analysis of large-scale proteomics data reveals novel features of the insulin action pathway , 2014, Bioinform..

[277]  Joaquim A. Jorge,et al.  Challenges and approaches to interactive visualization in healthcare workspaces , 2019, Annals of Medicine.

[278]  Anol Bhattacherjee,et al.  Physicians' resistance toward healthcare information technology: a theoretical model and empirical test , 2007, Eur. J. Inf. Syst..

[279]  Jimeng Sun,et al.  A System for Mining Temporal Physiological Data Streams for Advanced Prognostic Decision Support , 2010, 2010 IEEE International Conference on Data Mining.