A network-based computational framework to predict and differentiate functions for gene isoforms using exon-level expression data.

MOTIVATION: Alternative splicing contributes significantly to the functional diversity of transcripts and proteins. Many alternatively spliced gene isoforms have been shown to perform specific biological functions in different contexts. Beyond gene-level expression, advances in high-throughput sequencing make it possible to estimate isoform-specific exon expression at high resolution, which is informative for studying splice variants with network analysis.

RESULTS: In this study, we propose a novel network-based analysis framework to predict isoform-specific functions from exon-level RNA-Seq data. First, we propose a unified framework, referred to as Iso-Net, that integrates two new mathematical methods (MINet and RVNet) for inferring co-expression networks from exon-level expression data under different data scenarios. On most simulated datasets, Iso-Net achieves higher prediction accuracy than existing methods, especially in two extreme cases: when the sample size is very small and when the two isoforms differ greatly in exon number. Furthermore, by defining relevant quantitative measures (e.g., the Jaccard correlation coefficient) and combining differential co-expression network analysis with GO functional enrichment analysis, we develop a co-expression network analysis framework to predict the functions of isoforms and to identify distinct functions among isoforms of the same gene. We apply Iso-Net to study gene isoforms of several important transcription factors in human myeloid differentiation, using exon-level RNA-Seq data from three different cell lines.

AVAILABILITY AND IMPLEMENTATION: Iso-Net is open source and freely available from https://github.com/Dingjie-Wang/Iso-Net.
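
The abstract names a Jaccard-type measure for comparing isoforms but does not define it here. As a minimal sketch only, and assuming the measure is applied to the sets of co-expressed partners of two isoforms in their inferred networks, the standard Jaccard index could be computed as follows. The function name `jaccard`, the isoform labels, and the gene names are hypothetical and are not taken from Iso-Net.

```python
def jaccard(a, b):
    """Standard Jaccard index |A ∩ B| / |A ∪ B| of two collections.

    Returns 0.0 when both are empty (a convention chosen for this sketch).
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical co-expression partners of two isoforms of the same gene.
neighbors = {
    "isoform_A": {"gene1", "gene2", "gene3"},
    "isoform_B": {"gene2", "gene3", "gene4", "gene5"},
}

score = jaccard(neighbors["isoform_A"], neighbors["isoform_B"])
print(score)  # 2 shared partners out of 5 in total
```

Under this assumption, a low score would suggest that the two isoforms have largely different co-expression partners, which is the kind of signal a differential co-expression analysis could follow up with GO enrichment.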
