A Single-Subject Method to Detect Pathways Enriched With Alternatively Spliced Genes

RNA-Sequencing data offers an opportunity to enable precision medicine, but most methods rely on gene expression alone. To date, no methodology exists to identify and interpret alternative splicing patterns within pathways for an individual patient. This study develops methodology and conducts computational experiments to test the hypothesis that pathway aggregation of subject-specific alternatively spliced genes (ASGs) can inform disease mechanisms and predict survival. We propose the N-of-1-pathways Alternatively Spliced (N1PAS) method, which takes an individual patient’s paired-sample RNA-Seq isoform expression data (e.g., tumor vs. non-tumor, before-treatment vs. during-therapy) and pathway annotations as inputs. N1PAS quantifies the degree of alternative splicing via Hellinger distances, followed by two-stage clustering to determine pathway enrichment. We provide a clinically relevant “odds ratio” along with statistical significance to quantify pathway enrichment. We validate our method in clinical samples and find that it selects relevant pathways (p < 0.05 in 4/6 data sets). Extensive Monte Carlo studies show that N1PAS powerfully detects pathway enrichment of ASGs while adequately controlling false discovery rates. Importantly, our studies also unveil highly heterogeneous single-subject alternative splicing patterns that cohort-based approaches overlook. Finally, we apply our patient-specific results to predict cancer survival (FDR < 20%) while providing diagnostics in pursuit of translating transcriptome data into clinically actionable information. Software is available at https://github.com/grizant/n1pas/tree/master.
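To make the first step concrete, the following is a minimal sketch of a per-gene Hellinger distance between the relative isoform usage of two paired samples. It is not taken from the N1PAS software: the function names, the normalization of isoform expression to proportions, and the choice to reject genes with zero total expression are all assumptions made for illustration. The two-stage clustering and odds-ratio steps are not shown.

```python
import math


def isoform_proportions(expression):
    """Convert one gene's isoform expression values to relative usage.

    Assumption: a gene with zero total expression is rejected rather than
    imputed; the actual method may handle this case differently.
    """
    total = sum(expression)
    if total <= 0:
        raise ValueError("gene has no expressed isoforms in this sample")
    return [value / total for value in expression]


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions p and q.

    H(p, q) = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2), bounded in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("both samples must list the same isoforms")
    squared_gap = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared_gap / 2.0)


# Hypothetical isoform expression for one gene in a paired sample.
normal = isoform_proportions([10.0, 30.0, 60.0])
tumor = isoform_proportions([60.0, 30.0, 10.0])
gene_distance = hellinger_distance(normal, tumor)
```

A distance of 0 means the isoform usage is identical in both samples, and a distance of 1 means the two samples express entirely different isoforms of the gene, so larger values suggest stronger evidence of alternative splicing for that gene.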
