Volume and Value of Big Healthcare Data

Modern scientific inquiries require significant data-driven evidence and trans-disciplinary expertise to extract valuable information and gain actionable knowledge about natural processes. Effective evidence-based decisions require collection, processing and interpretation of vast amounts of complex data. Moore's and Kryder's laws, which describe the exponential increase of computational power and information storage, respectively, dictate the need for rapid trans-disciplinary advances, technological innovation and effective mechanisms for managing and interrogating Big Healthcare Data. In this article, we review important aspects of Big Data analytics and discuss questions such as: What are the challenges and opportunities associated with this biomedical, social, and healthcare data avalanche? Are there innovative statistical computing strategies to represent, model, analyze and interpret Big heterogeneous data? We present the foundation of a new compressive big data analytics (CBDA) framework for representation, modeling and inference of large, complex and heterogeneous datasets. Finally, we consider specific directions likely to impact the process of extracting information from Big healthcare data, translating that information to knowledge, and deriving appropriate actions.
