Computational Analysis of Constraints on Noncoding Regions, Coding Regions and Gene Expression in Relation to Plasmodium Phenotypic Diversity

Background: Malaria-causing Plasmodium species differ markedly in phenotype, including host choice and a preference for invading particular cell types. The genetic bases of these phenotypic differences can be understood, in part, by investigating constraints on gene expression and on genic sequences, both coding and regulatory.

Methodology/Principal Findings: We investigated the evolutionary constraints on the sequence and expression of parasite genes by applying comparative genomics approaches to six Plasmodium genomes and two genome-wide expression studies. We found that the coding regions of Plasmodium transcription factor and sexual development genes are relatively less constrained, as are those of genes encoding CCCH zinc fingers and invasion proteins, all of which play important roles in these parasites. Transcription factors and genes with stage-restricted expression have conserved upstream regions, as do several gene classes critical to the parasite's lifestyle, namely ion transport, invasion, chromatin assembly and CCCH zinc fingers. Additionally, a cross-species comparison of expression patterns revealed that Plasmodium-specific genes exhibit significant expression divergence.

Conclusions/Significance: Overall, constraints on Plasmodium protein-coding regions confirm observations from other eukaryotes, in that transcription factors are under relatively lower constraint. Proteins relevant to the parasite's unique lifestyle also have lower constraint on their coding regions. Greater conservation of promoter motifs between Plasmodium species suggests tight regulatory control of lifestyle genes. However, interspecies divergence in the expression patterns of these genes suggests either that expression is controlled by genomic or epigenomic features not encoded in the proximal promoter sequence, or that combinatorial interactions between motifs confer species-specific expression patterns.
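The abstract does not state how expression divergence between species was quantified. As a minimal illustrative sketch, one common choice is one minus the Pearson correlation between the expression profiles of an orthologous gene pair, measured at matched life-cycle stages or time points. The metric, the function names and the toy profiles below are assumptions made for illustration, not the authors' method or data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / sqrt(sxx * syy)


def expression_divergence(profile_a, profile_b):
    """Hypothetical divergence score: 1 - Pearson correlation.

    Ranges from 0 (identical profile shape) to 2 (perfectly
    anti-correlated). Assumes both profiles are sampled at the
    same matched stages, in the same order.
    """
    return 1.0 - pearson(profile_a, profile_b)


# Toy profiles for one ortholog pair across four matched stages
# (invented values, for illustration only).
species_1 = [1.0, 2.0, 3.0, 4.0]
species_2_similar = [2.0, 4.0, 6.0, 8.0]
species_2_reversed = [4.0, 3.0, 2.0, 1.0]

conserved = expression_divergence(species_1, species_2_similar)
diverged = expression_divergence(species_1, species_2_reversed)
```

Under this assumed metric, an ortholog pair whose expression rises and falls at the same stages scores near 0 regardless of absolute expression level, while a pair with opposite stage preference scores near 2.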
