Big data: Some statistical issues

A broad review is given of the impact of big data on various aspects of investigation. The emphasis is partly, though not exclusively, on issues in epidemiological research.
