Distance-Based Analysis with Quantile Regression Models

Multivariate data with non-standard structure are emerging in many research areas, including genetics and genomics, ecology, and social science. Distance-based analysis commonly uses suitably defined pairwise distance measures to study the association between such variables. In this work, we consider a linear quantile regression model for pairwise distances. We investigate the large-sample properties of an estimator of the unknown coefficients and propose corresponding statistical inference procedures. Extensive simulations provide evidence of satisfactory finite-sample properties of the proposed method. Finally, we apply the method to a microbiome association study to illustrate its utility.
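To make the idea of a linear quantile regression on pairwise distances concrete, the following is a minimal sketch, not the estimator or inference procedure of this work. It assumes a single scalar covariate, so that the regressor for a pair of subjects is the absolute covariate difference, and it fits an intercept and a slope by minimizing the standard check loss over all pairs. The function names (`pairwise`, `fit_pairwise_quantile`), the toy data, and the choice of Euclidean distance are illustrative assumptions. The sketch gives point estimates only; it does not account for the dependence between pairs that share a subject, which any valid inference procedure must address.

```python
import itertools
import random


def check_loss(residual, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def pairwise(values, dist):
    """Apply `dist` to every unordered pair (i < j) of subjects."""
    return [dist(a, b) for a, b in itertools.combinations(values, 2)]


def fit_pairwise_quantile(z, d, tau):
    """Fit d ~ b0 + b1 * z at quantile level tau (hypothetical helper).

    With two coefficients, some minimizer of the total check loss
    passes through two data points, so we scan every such line and
    keep the one with the smallest loss. This brute-force search is
    only practical for small toy problems.
    """
    def total_loss(b0, b1):
        return sum(check_loss(dk - b0 - b1 * zk, tau) for zk, dk in zip(z, d))

    best = None
    for (z1, d1), (z2, d2) in itertools.combinations(zip(z, d), 2):
        if z1 == z2:
            continue  # a vertical line is not a valid fit
        b1 = (d2 - d1) / (z2 - z1)
        b0 = d1 - b1 * z1
        loss = total_loss(b0, b1)
        if best is None or loss < best[0]:
            best = (loss, b0, b1)
    return best[1], best[2]


# Toy data (assumed for illustration): 12 subjects, one covariate x,
# and a two-dimensional outcome y that shifts with x.
rng = random.Random(0)
x = [rng.uniform(0, 1) for _ in range(12)]
y = [[xi + rng.gauss(0, 0.1), 1 - xi + rng.gauss(0, 0.1)] for xi in x]

# Pairwise covariate differences and pairwise outcome distances.
z = pairwise(x, lambda a, b: abs(a - b))
d = pairwise(y, lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5)

for tau in (0.25, 0.5, 0.75):
    b0, b1 = fit_pairwise_quantile(z, d, tau)
    print(tau, round(b0, 3), round(b1, 3))
```

Fitting several quantile levels, as in the loop above, is what distinguishes this approach from mean-based distance regression: the slope may differ across quantiles of the distance distribution.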
