De Novo Assembly of Chickpea Transcriptome Using Short Reads for Gene Discovery and Marker Identification

Chickpea ranks third among food legume crops in world production. However, the genomic resources available for chickpea are still very limited. In the present study, the transcriptome of chickpea was sequenced with short reads on the Illumina Genome Analyzer platform. We assessed the effect of sequence quality, various assembly parameters and assembly programs on the final assembly output. Using Velvet followed by Oases with optimal parameters, we assembled ∼107 million high-quality trimmed reads into a non-redundant set of 53 409 transcripts (≥100 bp), representing about 28 Mb of unique transcriptome sequence. The transcripts had an average length of 523 bp and an N50 length of 900 bp, with coverage of 25.7 rpkm (reads per kilobase per million). At the protein level, a total of 45 636 (85.5%) chickpea transcripts showed significant similarity with unigenes/predicted proteins from other legumes or sequenced plant genomes. Functional categorization revealed the conservation of genes involved in various biological processes in chickpea. In addition, we identified simple sequence repeat motifs in the transcripts. The chickpea transcript set generated here provides a resource for gene discovery and the development of functional molecular markers. The strategy for de novo assembly of transcriptome data presented here should also be helpful in similar transcriptome studies.
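The assembly statistics quoted above (average length, N50 and rpkm) follow standard definitions, which can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline used in the study: the function names `mean_length`, `n50` and `rpkm` are hypothetical, the example lengths are made up, and the rpkm formula assumes the usual definition (reads mapped to a transcript, per kilobase of transcript, per million mapped reads).

```python
from typing import Sequence


def mean_length(lengths: Sequence[int]) -> float:
    """Average transcript length in bp."""
    if not lengths:
        raise ValueError("no transcript lengths given")
    return sum(lengths) / len(lengths)


def n50(lengths: Sequence[int]) -> int:
    """Length L such that transcripts of length >= L hold at least half of all assembled bases."""
    if not lengths:
        raise ValueError("no transcript lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest transcript down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads (assumed standard definition)."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


if __name__ == "__main__":
    toy_lengths = [100, 200, 300, 400, 500]  # made-up values, not data from the study
    print(mean_length(toy_lengths))
    print(n50(toy_lengths))
    print(rpkm(50, 1000, 1_000_000))
```

As a consistency check on the reported figures, 53 409 transcripts at a mean length of 523 bp give roughly 27.9 Mb, which agrees with the stated total of about 28 Mb.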
