A Global Assembly of Cotton ESTs

Approximately 185,000 Gossypium EST sequences comprising >94,800,000 nucleotides were amassed from 30 cDNA libraries constructed from a variety of tissues and organs under a range of conditions, including drought stress and pathogen challenges. These libraries were derived from allopolyploid cotton (Gossypium hirsutum; AT and DT genomes) as well as its two diploid progenitors, Gossypium arboreum (A genome) and Gossypium raimondii (D genome). ESTs were assembled using the Program for Assembling and Viewing ESTs (PAVE), resulting in 22,030 contigs and 29,077 singletons (51,107 unigenes). Further comparisons among the singletons and contigs led to recognition of 33,665 exemplar sequences that represent a nonredundant set of putative Gossypium genes containing partial or full-length coding regions and usually one or two UTRs. The assembly, along with its UniProt BLASTX hits, GO annotation, and Pfam analysis results, is freely accessible as a public resource for cotton genomics. Because ESTs from diploid and allotetraploid Gossypium were combined in a single assembly, we were able in many cases to distinguish duplicated genes in allotetraploid cotton bioinformatically and assign them to either the A or D genome. The assembly and associated information provide a framework for future investigation of cotton functional and evolutionary genomics.
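The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the published analysis: the variable names are hypothetical, the EST and nucleotide totals are the approximate figures quoted in the abstract, and the ESTs-per-contig estimate assumes that every EST entered the assembly and that each singleton corresponds to exactly one EST.

```python
# Figures quoted in the abstract. The EST and nucleotide totals are
# approximate ("approximately 185,000", ">94,800,000"), so the derived
# values are rough estimates, not results reported by the authors.
n_ests = 185_000
n_nucleotides = 94_800_000
n_contigs = 22_030
n_singletons = 29_077
n_exemplars = 33_665

# Unigenes are contigs plus singletons.
n_unigenes = n_contigs + n_singletons

# Rough mean EST length in nucleotides.
mean_est_length = n_nucleotides / n_ests

# Assumption: every EST was assembled and each singleton is one EST,
# so the remaining ESTs are distributed among the contigs.
ests_in_contigs = n_ests - n_singletons
mean_ests_per_contig = ests_in_contigs / n_contigs

# Fraction of unigenes retained as nonredundant exemplar sequences.
exemplar_fraction = n_exemplars / n_unigenes

print(f"unigenes: {n_unigenes}")
print(f"mean EST length: {mean_est_length:.0f} nt")
print(f"mean ESTs per contig: {mean_ests_per_contig:.1f}")
print(f"exemplars / unigenes: {exemplar_fraction:.2f}")
```

Under these assumptions the unigene total matches the 51,107 stated in the abstract, the mean EST length comes to a little over 500 nucleotides, and roughly two-thirds of the unigenes are retained as exemplars.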
