Quantile regression for survival data in modern cancer research: expanding statistical tools for precision medicine

Abstract Quantile regression links the whole distribution of an outcome to the covariates of interest and has become an important alternative to commonly used regression models. However, censoring of outcomes such as survival time, often the main endpoint in cancer studies, has hampered the use of quantile regression techniques because the outcome is only partially observed. With the advent of the precision medicine era and the availability of high-throughput data, quantile regression with high-dimensional predictors has attracted much attention and has provided insight beyond that of traditional regression approaches. This paper provides a practical guide to using quantile regression for right-censored outcome data with low- or high-dimensional covariates. We frame our discussion using a dataset from the Boston Lung Cancer Survivor Cohort, a hospital-based prospective cohort study, with the goals of broadening the scope of cancer research, maximizing the utility of collected data, and offering useful statistical alternatives. We use quantile regression to identify clinical and molecular predictors, such as CpG methylation sites, that are associated with high-risk lung cancer patients, such as those with short survival.
