Examining De Novo Transcriptome Assemblies via a Quality Assessment Pipeline

New de novo transcriptome assembly and annotation methods make it possible to study the transcriptomes of organisms that lack an assembled and annotated genome. A number of de novo transcriptome assembly methods are currently available, but evaluating the quality of the resulting assemblies has been difficult. To assess assembly quality, we composed a workflow of multiple quality-check measurements that, in combination, provide a clear evaluation of assembly performance. We present novel transcriptome assemblies and functional annotations for the Pacific whiteleg shrimp (Litopenaeus vannamei), a mariculture species of great national and international interest that has no solid transcriptome or genome reference. We examined the Pacific whiteleg shrimp transcriptome assemblies using multiple metrics and provide an improved gene annotation. Our investigations show that assessing the quality of an assembly purely on the basis of the assembler's statistical measurements can be misleading; we propose a hybrid approach that combines statistical quality checks with further biology-based evaluations.
