Semiparametric integrative interaction analysis for non-small-cell lung cancer

In genomic analysis, identifying markers associated with cancer outcomes or phenotypes is important but challenging. Based on the biological mechanisms of cancers and the characteristics of the available datasets, we propose a novel integrative interaction approach under a semiparametric model, in which genetic and environmental factors are included as the parametric and nonparametric components, respectively. The goal of this approach is to identify the genetic factors and gene–gene interactions associated with cancer outcomes, while estimating the nonlinear effects of environmental factors. The proposed approach is based on the threshold gradient-directed regularisation technique. Simulation studies indicate that the proposed approach outperforms alternative methods in identifying the main effects and interactions, and has favourable estimation and prediction accuracy. We analysed non-small-cell lung carcinoma datasets from The Cancer Genome Atlas, and the results demonstrate that the proposed approach can identify markers with important implications and that it performs favourably in terms of prediction accuracy, identification stability, and computational cost.
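To give a rough sense of the thresholding idea behind threshold gradient-directed regularisation, the following is a minimal sketch for a single dataset with a squared-error loss. It is not the integrative semiparametric estimator described above: the nonparametric environmental component and the multi-dataset structure are omitted, and the function names (`make_interaction_design`, `tgdr_linear`), the simulated data, and the values of `tau`, `step`, and `n_iter` are all assumptions chosen for illustration.

```python
import numpy as np


def make_interaction_design(G):
    """Stack main effects and all pairwise products of the columns of G.

    Returns the design matrix and a list of index tuples, (j,) for a main
    effect and (j, k) for the interaction between columns j and k.
    """
    n, p = G.shape
    cols = [G[:, j] for j in range(p)]
    names = [(j,) for j in range(p)]
    for j in range(p):
        for k in range(j + 1, p):
            cols.append(G[:, j] * G[:, k])
            names.append((j, k))
    return np.column_stack(cols), names


def tgdr_linear(X, y, tau=0.9, step=0.01, n_iter=300):
    """Thresholded gradient updates for least squares (illustrative only).

    At each iteration, only coordinates whose gradient magnitude is at
    least tau times the largest gradient magnitude are updated.
    """
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        # Negative gradient of the loss 0.5 * mean((y - X beta)^2).
        grad = X.T @ (y - X @ beta) / n
        max_abs = np.max(np.abs(grad))
        if max_abs == 0.0:
            break
        keep = np.abs(grad) >= tau * max_abs  # threshold indicator
        beta += step * keep * grad
    return beta


# Simulated example (hypothetical data): one main effect and one interaction.
rng = np.random.default_rng(0)
G = rng.standard_normal((200, 5))
y = 2.0 * G[:, 0] + 1.5 * G[:, 1] * G[:, 2] + 0.1 * rng.standard_normal(200)

X, names = make_interaction_design(G)
beta = tgdr_linear(X, y)

# Report the terms ranked by absolute coefficient size.
order = np.argsort(-np.abs(beta))
for i in order[:4]:
    print(names[i], round(float(beta[i]), 3))
```

A threshold `tau` close to 1 updates only the coordinates with the steepest gradients and so tends to give sparser fits, while `tau = 0` reduces to ordinary gradient descent on all coordinates. In practice `tau` and the number of iterations would be treated as tuning parameters, for example chosen by cross-validation.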
