Isoform Sequencing and State-of-Art Applications for Unravelling Complexity of Plant Transcriptomes

Single-molecule real-time (SMRT) sequencing, a third-generation sequencing (TGS) technology developed by PacBio, offers longer reads than second-generation sequencing (SGS). Because it can capture full-length transcripts without assembly, PacBio isoform sequencing (Iso-Seq) of transcriptomes is advantageous for genome annotation, for the identification of novel genes and isoforms, and for the discovery of long non-coding RNAs (lncRNAs). In addition, Iso-Seq allows the direct detection of alternative splicing, alternative polyadenylation (APA), gene fusion, and DNA modifications. Such applications of Iso-Seq facilitate the understanding of gene structure, post-transcriptional regulatory networks, and, subsequently, proteomic diversity. In this review, we summarize the applications of Iso-Seq in plant transcriptome studies, point out the challenges associated with each step of the experimental design, and highlight the development of bioinformatic pipelines. We aim to provide the community with an integrative overview of and comprehensive guidance on Iso-Seq, and thus to promote its application in plant research.
