A User-Friendly, Web-Based Integrative Tool (ESurv) for Survival Analysis: Development and Validation Study

Background: Prognostic genes and gene signatures are widely used to predict patient survival and to guide therapeutic decisions. Although several web-based survival analysis tools have been developed, they have several limitations.

Objective: Taking these limitations into account, we developed ESurv (Easy, Effective, and Excellent Survival analysis tool), a web-based tool that performs advanced survival analyses using user-derived data or data from The Cancer Genome Atlas (TCGA). Users can conduct univariate analyses and grouped variable selections using multiomics data from TCGA.

Methods: We used R to code survival analyses based on multiomics data from TCGA. Before running these analyses, we excluded patients and genes with insufficient information. Clinical variables with two categories were coded as 0 and 1 (for example, chemotherapy: no or yes), and dummy variables were used for variables with 3 or more categories (for example, laterality: right, left, or bilateral).

Results: In univariate analyses, ESurv assesses the prognostic significance of single genes using survival curves (median or optimal cutoff), the area under the curve (AUC) with C statistics, and receiver operating characteristic (ROC) curves. Users can obtain prognostic variable signatures based on multiomics data from clinical variables or grouped variable selections (lasso, elastic net regularization, and network-regularized high-dimensional Cox regression) and select the same outputs as above. In addition, users can create custom gene signatures for specific cancers using various genes of interest. One of the most important functions of ESurv is that users can perform all survival analyses using their own data.

Conclusions: By using advanced statistical techniques suited to high-dimensional data, including genetic data, together with integrated survival analysis, ESurv overcomes the limitations of previous web-based tools and will help biomedical researchers perform complex survival analyses easily.
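The clinical-variable coding described in the Methods can be illustrated with a minimal sketch. The study itself used R; the Python below is only an illustration, and the function name `encode_clinical`, the alphabetical ordering of categories, and the choice to drop the first category as the reference level are assumptions that are not stated in the source. It also assumes that records with missing values have already been excluded.

```python
def encode_clinical(records, column):
    """Encode one categorical clinical variable (hypothetical helper).

    Two categories  -> a single 0/1 column.
    Three or more   -> one 0/1 dummy column per non-reference category.
    Assumption: categories are sorted alphabetically and the first one
    is treated as the reference level.
    """
    levels = sorted({record[column] for record in records})
    if len(levels) < 2:
        raise ValueError(f"{column!r} needs at least two categories")
    others = levels[1:]  # every category except the reference level

    encoded = []
    for record in records:
        value = record[column]
        if len(levels) == 2:
            # Binary variable: 0 for the reference category, 1 for the other.
            encoded.append({column: int(value == others[0])})
        else:
            # Multi-category variable: one indicator per non-reference level.
            encoded.append(
                {f"{column}_{level}": int(value == level) for level in others}
            )
    return encoded


# Made-up example records, not data from the study.
patients = [
    {"chemotherapy": "no", "laterality": "right"},
    {"chemotherapy": "yes", "laterality": "left"},
    {"chemotherapy": "no", "laterality": "bilateral"},
]

print(encode_clinical(patients, "chemotherapy"))
print(encode_clinical(patients, "laterality"))
```

With this coding, chemotherapy becomes a single indicator (no = 0, yes = 1), while laterality becomes two indicators (left and right) measured against the bilateral reference category.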