Predictive modeling of lung cancer recurrence using alternative splicing events versus differential expression data

Lung cancer is the leading cause of cancer-related deaths worldwide. Biomarker discovery has become increasingly important for the effective diagnosis, prognosis and treatment of the disease. The analysis of differential gene expression data has been the primary method for biomarker discovery. Our research demonstrates that alternative splicing events (ASE) can be another source of data for predictive model creation, identifying putative biomarkers that are complementary to those found by traditional gene expression analysis. RNA-Seq data from 21 patients diagnosed with lung adenocarcinoma, a non-small cell lung carcinoma, 11 of whom relapsed, were analyzed. After quantifying splice variants and gene expression with a bioinformatics pipeline, we created predictive models using orthogonal projections to latent structures discriminant analysis (OPLS-DA) that recognize two clinical phenotypes (disease free and relapse), thus distinguishing between more indolent and more aggressive disease. Hierarchical clustering of samples before and after predictive-model feature selection showed that clustering based on ASE was more indicative of the relapse phenotype. A novel hybrid multiple-objective genetic algorithm combining alternative splicing events with gene expression was used for discriminant feature selection. A post-processing examination of the putative biomarkers found by the genetic algorithm, together with ranked correlation tests, demonstrates that the analysis of alternative splicing events provides complementary and non-redundant predictive power by identifying biologically relevant patterns that do not result in differential gene expression.
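
To make the idea of a ranked correlation test concrete, the following is a minimal sketch of ranking candidate features by their rank correlation with a binary relapse label. It assumes Spearman's coefficient, which the abstract does not specify, and it is not the authors' pipeline. The feature names and all values are hypothetical, synthetic placeholders chosen only to illustrate the computation.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature carries no rank information
    return cov / (var_x * var_y) ** 0.5


def rank_features(features, labels):
    """Sort features by absolute rank correlation with the phenotype."""
    scores = {name: spearman(vals, labels) for name, vals in features.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Synthetic example: 0 = disease free, 1 = relapse (not data from the study).
labels = [0, 0, 0, 1, 1, 1]
features = {
    "ase_psi_hypothetical": [0.10, 0.15, 0.20, 0.70, 0.80, 0.90],
    "gene_fpkm_hypothetical": [5.0, 9.0, 7.0, 6.0, 8.0, 10.0],
}
ranking = rank_features(features, labels)
for name, rho in ranking:
    print(f"{name}: rho = {rho:.3f}")
```

In this toy input the splicing-style feature separates the two groups cleanly while the expression-style feature does not, so it ranks first. With real data, a feature that ranks highly on splicing but not on expression would be one candidate for the complementary signal described above.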
