Radiomics Analysis Using Stability Selection Supervised Principal Component Analysis for Right-censored Survival Data

Radiomics is an emerging field that extracts a large number of quantitative features from biomedical images using data-characterization algorithms. It provides a noninvasive approach to personalized therapy decisions by identifying distinctive imaging features that predict prognosis and therapeutic response. So far, many published radiomics studies have identified prognostic markers from biomedical images using existing out-of-the-box algorithms that are not specific to radiomics data. To better utilize biomedical images, we propose a novel machine learning approach, stability selection supervised principal component analysis (SSSuperPCA), which identifies a set of stable features from radiomics big data and couples this selection with dimension reduction for right-censored survival outcomes. The proposed approach allows us to identify a set of stable features that are highly associated with the survival outcomes, control the per-family error rate, and predict survival in a simple yet meaningful manner. We evaluate the performance of SSSuperPCA using simulations and real data sets for non-small cell lung cancer and head and neck cancer, and compare it with other machine learning algorithms. The results demonstrate that our method has a competitive edge over other existing methods in identifying prognostic markers from biomedical big imaging data for the prediction of right-censored survival outcomes. An R package, SSSuperPCA, is available at http://web.hku.hk/~herbpang/SSSuperPCA.html
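
To make the idea concrete, the following is a minimal sketch of how stability selection can be combined with supervised principal components for right-censored data. It is not the SSSuperPCA package's implementation: the function names, the use of a univariate Cox score statistic for ranking, the half-sample scheme, and the values of q, the number of subsamples, and the threshold are all assumptions made for illustration, and the data are simulated with no tied event times. The per-family error rate helper uses the standard Meinshausen and Buhlmann bound, which holds only under that method's assumptions.

```python
import numpy as np

def cox_score_z(X, time, event):
    """Univariate Cox score-test z-statistic (at beta = 0) for each column of X.
    Assumes no tied times (illustrative simplification)."""
    order = np.argsort(time)
    X, event = X[order], event[order].astype(bool)
    n = X.shape[0]
    at_risk = np.arange(n, 0, -1)[:, None]                # risk-set size per sorted subject
    s1 = np.cumsum(X[::-1], axis=0)[::-1] / at_risk       # risk-set mean of x
    s2 = np.cumsum(X[::-1] ** 2, axis=0)[::-1] / at_risk  # risk-set mean of x^2
    U = (X - s1)[event].sum(axis=0)                       # score, summed over events
    V = (s2 - s1 ** 2)[event].sum(axis=0)                 # information, summed over events
    return U / np.sqrt(np.maximum(V, 1e-12))

def stable_features(X, time, event, q=5, n_subsamples=50, threshold=0.6, seed=0):
    """Keep the top-q features on random half-samples; return those selected
    in at least `threshold` of the subsamples, plus all selection frequencies."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)
        z = np.abs(cox_score_z(X[idx], time[idx], event[idx]))
        counts[np.argsort(z)[-q:]] += 1
    freq = counts / n_subsamples
    return np.flatnonzero(freq >= threshold), freq

def pfer_bound(q, p, threshold):
    """Meinshausen-Buhlmann upper bound on the expected number of false selections."""
    return q ** 2 / ((2 * threshold - 1) * p)

def first_principal_component(X_sel):
    """Scores on the first principal component of the selected features."""
    Xc = X_sel - X_sel.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[0]

# Simulated data (hypothetical): features 0-4 share a latent factor that drives the hazard.
rng = np.random.default_rng(1)
n, p = 200, 100
latent = rng.normal(size=n)
X = rng.normal(size=(n, p))
X[:, :5] += 2.0 * latent[:, None]
t_event = rng.exponential(scale=np.exp(-latent))  # hazard proportional to exp(latent)
t_cens = rng.exponential(scale=2.0, size=n)       # independent censoring times
time = np.minimum(t_event, t_cens)
event = t_event <= t_cens

selected, freq = stable_features(X, time, event, q=5)
score = first_principal_component(X[:, selected])  # one risk score per subject
print("selected features:", selected)
print("PFER bound:", pfer_bound(5, p, 0.6))
```

The resulting component score could then be used as a single covariate in a survival model, which is what keeps the prediction step simple. The sign of a principal component is arbitrary, so its direction should be read from the fitted model rather than from the score itself.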