aBioMarVsuit: A Biomarker Validation Suit for predicting Survival using gene signature

Abstract: The R package aBioMarVsuit can be used to discover a predictive gene signature for survival and to divide patients into low-risk and high-risk groups. Classifiers are constructed as linear combinations of important genes; prognostic factors and treatment effects can be incorporated if necessary. Several classifiers are implemented along with their validation procedures: a majority-votes technique, LASSO and elastic net based classifiers, and classifiers defined as a function of the scores of the first PCA or PLS component. For the latter, the gene expression matrix is reduced using the dimension reduction methods PCA and PLS, and only the scores of the first component are used in the classifier. A sensitivity analysis can be carried out on the cutoff values used for the PCA-based and PLS-based classifiers. Large-scale cross-validation can be performed to investigate which genes are selected most frequently during the evaluation process and to approximate the distribution of the hazard ratio (HR) for the low-risk group on both test and training data. Inference is based on resampling, specifically permutations, through which the null distribution of the estimated HR is approximated. The package depends on several other packages, mainly superpc and glmnet.
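The idea of a classifier built as a linear combination of genes, followed by a cutoff that splits patients into risk groups, can be illustrated with a minimal sketch. This is not the package's R code or API: the function names, the toy expression values, the coefficients, and the choice of the median as the default cutoff are all assumptions made for illustration.

```python
from statistics import median


def risk_scores(expression, weights):
    """Per-patient score: sum over genes of weight * expression value."""
    return [sum(w * x for w, x in zip(weights, row)) for row in expression]


def split_by_cutoff(scores, cutoff=None):
    """Label each patient 'low' or 'high' risk.

    The median score is used as the cutoff when none is given (an assumption).
    """
    if cutoff is None:
        cutoff = median(scores)
    return ["low" if s <= cutoff else "high" for s in scores]


# Toy data: 4 patients x 3 genes (made-up values, not from the package).
expression = [
    [1.0, 0.5, 2.0],
    [0.2, 1.5, 0.1],
    [2.0, 2.0, 1.0],
    [0.1, 0.3, 0.2],
]
# Hypothetical coefficients, e.g. as might come from a penalized Cox fit.
weights = [0.8, -0.4, 0.3]

scores = risk_scores(expression, weights)
groups = split_by_cutoff(scores)
```

Passing a different `cutoff` to `split_by_cutoff` and comparing the resulting groups is one simple way to mimic the kind of cutoff sensitivity analysis described above.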
