Platform-Independent Gene-Expression Based Classification-System for Molecular Sub-typing of Cancer

Molecular stratification of cancer patients is driving the development of targeted therapies in precision medicine. Clustering tumor samples by gene expression profiles from high-throughput platforms, such as microarrays or next-generation sequencing, has identified distinct tumor subtypes for numerous cancers. However, most of the derived classifiers or gene signatures have not reached clinical utility. Informatics methods that accurately translate a derived gene signature from the high-throughput platform to a clinically adaptable low-dimensional platform are therefore critical. In this chapter, we discuss a workflow to derive gene signatures and then transfer them from one analytical platform to another for cancer patient stratification. We summarize the results of the workflow on two different cancers. Finally, we discuss the importance of data discretization in dealing with cross-platform data and of incorporating splice-variant or isoform-level gene expression profiles in the statistical analyses.
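
To illustrate why discretization can help with cross-platform data, the following is a minimal sketch, not the chapter's actual method. It assumes a simple equal-frequency (rank-based) binning of one feature across samples; the function name `equal_frequency_bins`, the choice of three bins, and the expression values are all hypothetical. The point is only that two platforms reporting the same samples on very different scales can map to identical discrete levels when their values are related monotonically.

```python
def equal_frequency_bins(values, n_bins=3):
    """Assign each value a bin index 0..n_bins-1 based on its rank.

    Assumption: equal-frequency binning is used here purely as an
    illustration; ties are broken by input order (sorted() is stable).
    """
    # Indices of the samples, ordered from lowest to highest expression.
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        # Spread the ranks evenly over the requested number of bins.
        bins[idx] = rank * n_bins // len(values)
    return bins


# Hypothetical expression of one isoform in the same six samples,
# measured on two platforms with different scales.
microarray = [7.1, 9.8, 8.2, 11.5, 6.4, 10.3]       # log2-intensity-like
rnaseq = [35.0, 410.0, 88.0, 2200.0, 12.0, 640.0]   # count-like

print(equal_frequency_bins(microarray))
print(equal_frequency_bins(rnaseq))
```

Because the binning depends only on the ordering of samples, both calls yield the same low/medium/high assignment here. A classifier trained on such discrete levels from one platform could, under this assumption, be applied to discretized data from the other.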
