On the use of kernel machines for Mendelian randomization

Background: Properly adjusting for unmeasured confounders is critical in health studies for valid testing and estimation of an exposure’s causal effect on outcomes. The instrumental variable (IV) method has long been used in econometrics to estimate causal effects in the presence of unmeasured confounders. Mendelian randomization (MR), which uses genetic variants as instrumental variables, applies the IV method to biomedical research and has become popular in recent years. A commonly used estimator of causal effects in IV analysis and MR is the two-stage least squares (TSLS) estimator. The validity of TSLS relies on accurate prediction of the exposure from the IVs in its first stage.

Results: In this note, we propose to model the link between the exposure and the genetic IVs using the least-squares kernel machine (LSKM). Simulation studies are used to evaluate the feasibility of LSKM in the TSLS setting.

Conclusions: Our results show that LSKM based on a genotype score or on genotypes can be used effectively in TSLS. It may provide higher power when the association between the exposure and the genetic IVs is nonlinear.
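The two-stage idea described above can be illustrated with a minimal sketch on simulated data. This is not the authors' implementation. It assumes a Gaussian kernel on a genotype score, a fixed bandwidth `rho` and penalty `lam` (both chosen arbitrarily here), and a first stage fitted by kernel ridge regression, which shares its closed form with the penalized least-squares fit underlying LSKM. The function names (`gaussian_kernel`, `lskm_fitted`, `slope`) and all simulation settings are hypothetical.

```python
import numpy as np

def gaussian_kernel(Z, rho):
    """Gaussian kernel matrix: K[i, j] = exp(-||z_i - z_j||^2 / rho)."""
    sq = np.sum(Z**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.exp(-np.maximum(d2, 0.0) / rho)

def lskm_fitted(Z, x, rho=2.0, lam=1.0):
    """First stage: kernel ridge fit of the exposure x on instruments Z.

    rho and lam are fixed by assumption; they are not tuned or estimated.
    """
    K = gaussian_kernel(Z, rho)
    x_mean = x.mean()
    alpha = np.linalg.solve(K + lam * np.eye(len(x)), x - x_mean)
    return x_mean + K @ alpha

def slope(y, x):
    """OLS slope of y on x, with an intercept."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

# Simulated data (all settings are illustrative assumptions).
rng = np.random.default_rng(0)
n, p, beta_true = 1000, 5, 0.5
G = rng.binomial(2, 0.3, size=(n, p))         # genotypes coded 0/1/2
score = G.sum(axis=1).astype(float)           # unweighted genotype score
u = rng.normal(size=n)                        # unmeasured confounder
x = 0.5 * (score - 3.0) ** 2 + u + rng.normal(size=n)  # nonlinear G -> exposure
y = beta_true * x + 2.0 * u + rng.normal(size=n)       # outcome

# Naive regression of outcome on exposure ignores the confounder.
beta_ols = slope(y, x)

# Stage 1: predict the exposure from the genotype score with the kernel fit.
x_hat = lskm_fitted(score[:, None], x)
# Stage 2: regress the outcome on the predicted exposure.
beta_lskm = slope(y, x_hat)

print(f"true effect: {beta_true}")
print(f"naive OLS:   {beta_ols:.3f}")
print(f"kernel TSLS: {beta_lskm:.3f}")
```

To use genotypes rather than a score as the kernel input, `G.astype(float)` can be passed in place of `score[:, None]`. In practice `rho` and `lam` would need to be selected from the data, and a flexible first stage fitted in-sample can overfit, so this sketch should be read as an illustration of the mechanics only.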
