Local Coexpression Domains of Two to Four Genes in the Genome of Arabidopsis

Gene expression in eukaryotic genomes is known to cluster, but cluster size is generally loosely defined and highly variable. Here we adopt a very strict definition of a cluster: a set of physically adjacent genes that are highly coexpressed, forming a so-called local coexpression domain. The Arabidopsis (Arabidopsis thaliana) genome was analyzed for the presence of such local coexpression domains to elucidate its functional characteristics. We used expression data sets that cover different experimental conditions, organs, tissues, and cells from the Massively Parallel Signature Sequencing (MPSS) repository, and Affymetrix microarray data from a detailed root analysis. With these expression data, we identified 689 and 1,481 local coexpression domains, respectively, each consisting of two to four genes with a pairwise Pearson's correlation coefficient larger than 0.7. These numbers are approximately 1- to 5-fold higher than the numbers expected by chance. A small (5%–10%) yet significant fraction of genes in the Arabidopsis genome is therefore organized into local coexpression domains. These local coexpression domains were distributed over the genome. Genes in such local domains were for the most part not categorized in the same functional category (GOslim). Neither tandem gene duplication, nor shared promoter sequence, nor gene distance explained the occurrence of coexpression of genes in such chromosomal domains. This indicates that other parameters of genes or gene positions are important for establishing coexpression in local domains of Arabidopsis chromosomes.
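The domain definition above (physically adjacent genes whose pairwise Pearson's correlation coefficient exceeds 0.7) can be illustrated with a minimal sketch. This is not the authors' pipeline: the greedy left-to-right scan, the function names `pearson` and `local_domains`, the cap of four genes per domain (chosen to mirror the reported range), and the toy expression profiles are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0  # assumption: a flat profile counts as uncorrelated
    return sum(a * b for a, b in zip(dx, dy)) / denom


def local_domains(profiles, threshold=0.7, max_size=4):
    """Greedy scan for runs of adjacent, mutually coexpressed genes.

    profiles: expression profiles listed in chromosomal order.
    Returns (start, end) index pairs, with end inclusive.
    """
    domains = []
    i = 0
    while i < len(profiles) - 1:
        end = i
        # Extend the run while the next gene correlates above the
        # threshold with every gene already in the run.
        while (
            end + 1 < len(profiles)
            and end + 1 - i < max_size
            and all(
                pearson(profiles[j], profiles[end + 1]) > threshold
                for j in range(i, end + 1)
            )
        ):
            end += 1
        if end > i:
            domains.append((i, end))
            i = end + 1
        else:
            i += 1
    return domains


# Hypothetical toy data: five adjacent genes measured in five samples.
profiles = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 11],
    [5, 3, 4, 1, 2],
    [1, 5, 2, 6, 3],
    [2, 9, 4, 12, 5],
]
domains = local_domains(profiles)
print(domains)
```

Requiring every pair within a run to pass the threshold, rather than only consecutive neighbours, matches the pairwise criterion stated in the abstract. The greedy scan itself is a simplification and may split or miss domains that an exhaustive search would find.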