Pathway-Informed Classification System (PICS) for Cancer Analysis Using Gene Expression Data

We introduce the Pathway-Informed Classification System (PICS), a computational method for classifying cancers based on tumor-sample gene expression levels. PICS rapidly identifies both known and novel biological pathway involvement specific to various cancers and uses that learned pathway information to separate patients into distinct classes. The method clearly separates a pan-cancer dataset by tissue of origin and also sub-classifies individual cancer datasets into distinct survival classes. Gene expression values are collapsed into pathway scores that reveal which biological activities are most useful for clustering cancer cohorts into subtypes. Variants of the method allow it to be used on datasets with and without noncancerous samples. Activity levels of all types of pathways, broadly grouped into metabolic, cellular processes and signaling, and immune system, are useful for separating the pan-cancer cohort. When clustering specific cancer types, the most informative pathway types depend on the site being studied: for lung cancer, signaling pathways dominate; for pancreatic cancer, signaling and metabolic pathways dominate; and for melanoma, immune system pathways are the most useful. This work supports the utility of pathway-level genomic analysis and points toward using pathway classification to predict the efficacy and side effects of drugs and radiation.
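The abstract does not state how gene expression values are collapsed into pathway scores, so the following is only a minimal sketch of the general idea under stated assumptions, not the PICS scoring rule. It assumes a pathway score is the mean z-score of the pathway's member genes, and it ranks pathways by the spread of their scores across samples as a rough proxy for usefulness in clustering. The gene names, pathway names, and the functions `zscore_genes`, `pathway_scores`, and `rank_pathways` are all hypothetical.

```python
from statistics import mean, pstdev


def zscore_genes(expression):
    """Standardize each gene across samples.

    expression: dict mapping gene name -> list of values, one per sample.
    """
    z = {}
    for gene, values in expression.items():
        mu, sd = mean(values), pstdev(values)
        # A gene with zero variance carries no information; map it to zeros.
        z[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return z


def pathway_scores(expression, pathways):
    """Collapse gene-level z-scores into one score per pathway per sample.

    ASSUMPTION: score = mean z-score of the member genes present in the data.
    The abstract does not specify the actual PICS scoring function.
    """
    z = zscore_genes(expression)
    n_samples = len(next(iter(expression.values())))
    scores = {}
    for name, genes in pathways.items():
        members = [z[g] for g in genes if g in z]
        if not members:
            continue  # skip pathways with no measured genes
        scores[name] = [mean(m[i] for m in members) for i in range(n_samples)]
    return scores


def rank_pathways(scores):
    """Order pathways by the spread of their scores, most variable first.

    ASSUMPTION: spread across samples is used here as a simple stand-in
    for how useful a pathway is for separating samples.
    """
    return sorted(scores, key=lambda p: pstdev(scores[p]), reverse=True)


# Toy data: 4 samples, hypothetical gene and pathway names.
expression = {
    "GENE_A": [1.0, 1.2, 5.0, 5.2],
    "GENE_B": [0.9, 1.1, 4.8, 5.1],
    "GENE_C": [3.0, 3.1, 2.9, 3.0],
    "GENE_D": [2.0, 2.2, 2.1, 1.9],
}
pathways = {
    "toy_signaling": ["GENE_A", "GENE_B"],
    "toy_metabolic": ["GENE_C", "GENE_D"],
}

scores = pathway_scores(expression, pathways)
ranking = rank_pathways(scores)
```

In this toy input the two genes in `toy_signaling` move together and split the samples into a low pair and a high pair, so that pathway ranks first. The resulting sample-by-pathway score matrix is what a clustering step would then consume in place of the raw gene-level matrix.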
