Testing and estimation in marker‐set association study using semiparametric quantile regression kernel machine

We consider quantile regression for partially linear models in which an outcome of interest is related to covariates and a marker set (e.g., a gene or pathway). The covariate effects are modeled parametrically, and the joint effect of the multiple loci in the marker set is modeled using a kernel machine. We propose an efficient algorithm to solve the corresponding optimization problem for estimating the covariate effects, and we introduce a powerful test for detecting the overall effect of the marker set. The test is motivated by the traditional score test and borrows ideas from permutation tests. We evaluate the estimation and testing procedures numerically and apply them to assess the genetic association of change in fasting homocysteine level using data from the Vitamin Intervention for Stroke Prevention Trial.
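The testing idea can be illustrated with a small sketch of a permutation-calibrated, score-type statistic for a marker-set effect at quantile level tau. This is not the authors' procedure. It assumes an intercept-only null model (no covariates), so the null fit is just the sample tau-quantile; it assumes an identity-by-state (IBS) kernel on genotypes coded 0/1/2; and it calibrates the statistic by permuting the null scores across subjects. All function names are hypothetical.

```python
import math
import random


def sample_quantile(values, tau):
    """Order statistic that minimises the check loss at level tau (0 < tau < 1)."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, max(0, math.ceil(tau * len(ordered)) - 1))
    return ordered[idx]


def check_score(residual, tau):
    """Score of the check loss: tau - 1{residual < 0}."""
    return tau - (1.0 if residual < 0 else 0.0)


def ibs_kernel(genotypes):
    """IBS kernel for genotypes coded 0/1/2 (assumed kernel choice)."""
    n, p = len(genotypes), len(genotypes[0])
    return [
        [
            sum(2 - abs(a - b) for a, b in zip(genotypes[i], genotypes[j])) / (2 * p)
            for j in range(n)
        ]
        for i in range(n)
    ]


def quadratic_form(scores, kernel):
    """Score-type statistic Q = scores' K scores."""
    return sum(
        s_i * sum(k_ij * s_j for k_ij, s_j in zip(row, scores))
        for s_i, row in zip(scores, kernel)
    )


def permutation_test(y, genotypes, tau=0.5, n_perm=999, seed=0):
    """Return (Q, permutation p-value) for the marker-set effect.

    Simplifying assumption: the null model has an intercept only, so the
    null fit is the sample tau-quantile of y.
    """
    rng = random.Random(seed)
    q_hat = sample_quantile(y, tau)
    scores = [check_score(v - q_hat, tau) for v in y]
    kernel = ibs_kernel(genotypes)
    observed = quadratic_form(scores, kernel)
    exceed = 0
    for _ in range(n_perm):
        shuffled = scores[:]
        rng.shuffle(shuffled)  # break any link between outcome and markers
        if quadratic_form(shuffled, kernel) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)
```

Calling `permutation_test(y, genotypes, tau=0.5)` with `y` a list of outcomes and `genotypes` a list of per-subject genotype lists returns the statistic and its permutation p-value. With covariates in the model, the null fit would instead come from a linear quantile regression of the outcome on the covariates, and simple permutation of the scores would need more care.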
