EST analysis in barley defines a unigene set comprising 4,000 genes

Abstract We report the generation of 13,109 expressed sequence tags (ESTs) from barley as a first step towards a unigene set for this organism. The sequences were generated from three libraries encompassing 7,568 cDNA clones. Comparisons with nucleic acid and protein sequence databases enabled the assignment of putative functions to the mRNAs. The results of the protein database searches were parsed and built into a regularly updated database that is available over the World Wide Web. The Stack_Pack clustering system was applied to survey the level of redundancy, which was calculated to be 69%; on this basis we identified 4,000 different barley genes. To demonstrate that the clustering results are usable in further experiments, we subjected alignments of sequences similar to elongation factor 1 alpha to additional analysis. These sequences formed the largest group with identical putative functions (228 members), and clustering based on 3′ sequences subdivided the group into five different assemblies. Alignments of the consensus sequences facilitated the development of PCR assays suitable for genetic mapping of four of the gene-family members, which reside on chromosomes 2H, 4H and 5H. This demonstrates the suitability of the cluster results as a basis for in-depth analyses of barley gene families.
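The relationship between the EST count, the redundancy level and the unigene count can be checked with a short calculation. The sketch below assumes that redundancy is defined as the fraction of ESTs that do not add a new cluster, that is, 1 minus the ratio of unigenes to ESTs; the abstract does not state the exact formula, and the function name `redundancy` is hypothetical.

```python
def redundancy(n_ests: int, n_unigenes: int) -> float:
    """Fraction of ESTs that are redundant under the assumed definition.

    Assumption (not stated in the abstract): redundancy = 1 - unigenes / ESTs.
    """
    if n_ests <= 0:
        raise ValueError("n_ests must be positive")
    if not 0 <= n_unigenes <= n_ests:
        raise ValueError("n_unigenes must lie between 0 and n_ests")
    return 1.0 - n_unigenes / n_ests


# Figures taken from the abstract: 13,109 ESTs and 4,000 different genes.
pct = 100 * redundancy(13109, 4000)
print(f"redundancy: {pct:.1f}%")  # prints "redundancy: 69.5%"
```

Under this assumed definition the reported figures are mutually consistent: 4,000 unigenes out of 13,109 ESTs gives about 69.5%, in line with the 69% stated above.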
