Penalized Ordinal Regression Methods for Predicting Stage of Cancer in High-Dimensional Covariate Spaces

The pathological stage of a tumor is an important clinical designation and, like many other forms of biomedical data, is an ordinal outcome. Statistical methods for predicting an ordinal outcome from clinical, demographic, and high-dimensional correlated features are currently lacking. In this paper, we propose a method that fits an ordinal response model to predict an ordinal outcome in high-dimensional covariate spaces. Our method penalizes some covariates (high-throughput genomic features) while leaving others (such as demographic or clinical covariates) unpenalized. We demonstrate the method by predicting the stage of breast cancer. In our model, breast cancer subtype is a nonpenalized predictor, and CpG site methylation values from the Illumina Human Methylation 450K assay are penalized predictors. The method is available in the ordinalgmifs package for the R programming environment.
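To make the idea of penalizing only part of the covariate set concrete, the following is a minimal sketch, not the fitting algorithm implemented in ordinalgmifs. It assumes a cumulative logit model with the convention P(Y <= k) = logistic(alpha_k + eta) and an L1 penalty applied to the genomic coefficients only. The function names, the sign convention, and the choice of an L1 penalty are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def category_probs(alphas, eta):
    """Return P(Y = k) for k = 0..K-1 given increasing thresholds `alphas`.

    Assumed convention: P(Y <= k) = sigmoid(alphas[k] + eta).
    """
    cum = [sigmoid(a + eta) for a in alphas] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]


def penalized_nll(alphas, theta, beta, W, X, y, lam):
    """Negative log-likelihood plus an L1 penalty on `beta` only.

    W: rows of unpenalized covariates (e.g. subtype indicators), coefficients `theta`.
    X: rows of penalized covariates (e.g. methylation values), coefficients `beta`.
    y: ordinal outcomes coded 0..K-1.
    """
    nll = 0.0
    for w_i, x_i, y_i in zip(W, X, y):
        eta = sum(t * w for t, w in zip(theta, w_i))
        eta += sum(b * x for b, x in zip(beta, x_i))
        nll -= math.log(category_probs(alphas, eta)[y_i])
    # The penalty touches beta only; theta is left unpenalized.
    return nll + lam * sum(abs(b) for b in beta)


if __name__ == "__main__":
    # Toy data (hypothetical): one subtype indicator, two methylation features.
    W = [[1.0], [0.0], [1.0]]
    X = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.9]]
    y = [0, 2, 1]
    print(penalized_nll([-1.0, 1.0], [0.7], [0.3, -0.5], W, X, y, lam=0.5))
```

In this sketch, increasing `lam` shrinks only the genomic coefficients, while the clinical coefficients in `theta` enter the likelihood without any penalty term.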
