Evaluate Cutpoints: Adaptable continuous data distribution system for determining survival in Kaplan-Meier estimator

BACKGROUND AND OBJECTIVE: Growing evidence of transcriptional and metabolomic differentiation has prompted many studies that analyze such differentiation in the context of disease progression, treatment outcome, or the influence of various factors affecting cellular and tissue metabolism. Cancer researchers in particular are looking for new biomarkers that can serve as diagnostic or prognostic factors, and for the relationship of those biomarkers to clinical effects. Because of the increasing interest in dichotomizing continuous variables in clinical or epidemiological data (gene expression, biomarkers, biochemical parameters, etc.), there is a large demand for cutoff point determination tools, yet software that stratifies patients based on continuous and binary variables is lacking. We therefore developed the "Evaluate Cutpoints" application, which offers a wide set of statistical and graphical methods for cutpoint optimization and enables stratification of a population into two or three groups.

METHODS: The application is based on the R language and includes algorithms from packages such as survival, survMisc, OptimalCutpoints, maxstat, Rolr, ggplot2, GGally and plotly, offering Kaplan-Meier plots and ROC curves with cutoff point determination.

RESULTS: All capabilities of Evaluate Cutpoints were illustrated with an example analysis of the estrogen, progesterone and human epidermal growth factor receptor 2 receptors in a breast cancer cohort. Using ROC curves, cutoff points were established for the expression of ESR1, PGR and ERBB2 in relation to their immunohistochemical status (cutoff: 1301.253, 243.35 and 11,434.438, respectively; sensitivity: 94%, 85% and 64%, respectively; specificity: 93%, 86% and 91%, respectively). Using disease-free survival analysis, we divided patients into two and three groups according to the expression of ESR1, PGR and ERBB2. As an example, the cutp algorithm showed that lower expression of ESR1 and ERBB2 was more favorable (HR = 2.07, p = 0.0412; HR = 2.79, p = 0.0777, respectively), whereas higher PGR expression was correlated with better prognosis (HR = 0.192, p = 0.0115).

CONCLUSIONS: This work presents the Evaluate Cutpoints application, which is freely available to download at http://wnbikp.umed.lodz.pl/Evaluate-Cutpoints/. Several software tools, such as Cutoff Finder and X-Tile, are currently used to split continuous variables, and they offer distinct algorithms. Unlike them, Evaluate Cutpoints allows not only dichotomization of populations according to continuous and binary variables, but also stratification into three groups and manual selection of the cutoff point, thus preventing potential loss of information.
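As an illustration of ROC-based cutoff determination, the following is a minimal Python sketch that picks the threshold maximizing Youden's index (sensitivity + specificity - 1). It is not the application's R implementation, and the abstract does not state which ROC criterion was used, so the Youden criterion is an assumption here. The function name `youden_cutpoint` and the expression values are hypothetical and made up for the example.

```python
from typing import List, Tuple


def youden_cutpoint(values: List[float], positive: List[bool]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A sample is classified as positive when its value is >= cutoff.
    Ties in J are resolved in favor of the lowest cutoff.
    """
    n_pos = sum(positive)
    n_neg = len(positive) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both positive and negative samples are required")

    best = None  # (J, cutoff, sensitivity, specificity)
    for cutoff in sorted(set(values)):
        # True positives: positive samples at or above the cutoff.
        tp = sum(1 for v, p in zip(values, positive) if p and v >= cutoff)
        # True negatives: negative samples below the cutoff.
        tn = sum(1 for v, p in zip(values, positive) if not p and v < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Made-up expression values and receptor status, for illustration only.
expression = [120.0, 340.0, 560.0, 900.0, 1500.0, 1800.0, 2200.0, 3100.0]
ihc_positive = [False, False, False, True, True, False, True, True]

cutoff, sens, spec = youden_cutpoint(expression, ihc_positive)
print(f"cutoff={cutoff}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Scanning every observed value as a candidate threshold is quadratic in the number of samples, which is acceptable for a sketch; a production tool would typically sort once and sweep.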
