Practical investigation of the performance of robust logistic regression to predict the genetic risk of hypertension

Logistic regression is usually applied to investigate the association between inherited genetic variants and a binary disease phenotype. A limitation of the standard methods used to estimate the parameters of logistic regression models is their strong sensitivity to a few observations that deviate from the majority of the data.

We used data from the Genetic Analysis Workshop 18 to explore the possible benefit of robust logistic regression for estimating the genetic risk of hypertension. Standard and robust methods were compared by the influence of atypical hypertension profiles (outliers) on the estimated odds ratios, on the areas under the receiver operating characteristic curves, and on the clinical net benefit.

Our results confirmed that single outliers may substantially affect the estimated genotype relative risks. The ranking of variants by probability values differed between standard and robust logistic regression. For cutoff probabilities between 0.2 and 0.6, the clinical net benefit estimated by leave-one-out cross-validation in the investigated sample was slightly larger under robust regression, but the overall area under the receiver operating characteristic curve was larger for standard logistic regression.

The potential advantage of robust statistics in the context of genetic association studies should be investigated in future analyses based on real and simulated data.
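To make the clinical net benefit concrete, the following is a minimal sketch using the usual decision curve analysis definition: true positives per subject minus false positives per subject, weighted by the odds of the cutoff probability. It is an illustration under stated assumptions, not the authors' implementation; the function name `net_benefit` and the toy data are hypothetical, and the predicted risks are simply assumed to come from some fitted model (in the study they would be leave-one-out predictions).

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], risk: Sequence[float], cutoff: float) -> float:
    """Net benefit of classifying as 'high risk' everyone with risk >= cutoff.

    Uses the standard decision curve analysis definition:
        NB = TP/n - (FP/n) * cutoff / (1 - cutoff)
    """
    if not 0.0 < cutoff < 1.0:
        raise ValueError("cutoff must lie strictly between 0 and 1")
    n = len(y_true)
    if n == 0 or n != len(risk):
        raise ValueError("y_true and risk must be non-empty and of equal length")
    # Count true and false positives among subjects flagged at this cutoff.
    tp = sum(1 for y, p in zip(y_true, risk) if p >= cutoff and y == 1)
    fp = sum(1 for y, p in zip(y_true, risk) if p >= cutoff and y == 0)
    return tp / n - (fp / n) * cutoff / (1.0 - cutoff)


# Hypothetical toy data: 1 = hypertensive, 0 = not; risks are made-up predictions.
y = [1, 0, 1, 0, 0, 1, 0, 1]
p = [0.9, 0.3, 0.6, 0.7, 0.1, 0.4, 0.2, 0.8]

# Evaluate a few cutoffs in the 0.2 to 0.6 range discussed in the abstract.
for c in (0.2, 0.4, 0.6):
    print(c, net_benefit(y, p, c))
```

Comparing two models then amounts to evaluating `net_benefit` on each model's predicted risks over the same grid of cutoffs; a larger value at a given cutoff indicates more benefit for that decision threshold.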
