Statistical analysis of gene expression profiles

Functional divergence after gene duplication has been considered an important mechanism for the evolution of new functions. Although gene expression profiles have been treated as an important indicator of gene function, large-scale gene expression analysis has mostly focused on current relationships among genes rather than on their evolutionary relationships. By putting expression analysis into the framework of evolution, we make inferences about expression divergence after gene duplication. Based on the Brownian-based model (Gu 2004), the posterior distribution of the ancestral expression profiles is shown to follow a multivariate normal distribution. This approach provides not only estimates of the ancestral expression profiles but also a measure of the precision of the estimation/prediction, thereby filtering significant information from the background noise of the data.

Introduction

Ancestral state reconstruction within an evolutionary tree is at the center of comparative studies in evolutionary biology. At the morphological level, comparative analysis looking for evidence of correlated change in two characters may need the help of inferred ancestral states (Harvey and Pagel, 1991). At the molecular level, the properties of ancient molecules (Malcolm et al., 1990; Stackhouse et al., 1990; Adey et al., 1994; Jermann et al., 1995) can be examined through the inferred ancestral amino acid sequences, which can be further tested in vivo or in vitro. Massive microarray expression profiles now make it possible to reconstruct ancestral expression patterns as well. In such an endeavor, an appropriate methodology for ancestral state inference is essential.
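The idea stated in the abstract, that a Brownian model yields a normal posterior for the ancestral expression level together with a measure of its precision, can be illustrated with a minimal sketch. It is not the model of Gu (2004); it assumes a deliberately simplified setting in which each duplicate gene descends directly from one ancestor, the expression level drifts along each branch with variance sigma2 times the branch length, the ancestor has a normal prior, and experimental conditions are treated independently. The function names and parameters (`ancestral_posterior`, `ancestral_profile`, `sigma2`, `prior_mean`, `prior_var`) are hypothetical and chosen only for this illustration.

```python
def ancestral_posterior(tip_values, branch_lengths, sigma2=1.0,
                        prior_mean=0.0, prior_var=1e6):
    """Posterior mean and variance of one ancestral expression level.

    Assumed model (illustrative only): ancestor z ~ N(prior_mean, prior_var),
    and each observed tip x_i ~ N(z, sigma2 * t_i), independently, where t_i
    is the length of the branch leading to tip i.
    """
    precision = 1.0 / prior_var          # precision contributed by the prior
    weighted_sum = prior_mean / prior_var
    for x, t in zip(tip_values, branch_lengths):
        w = 1.0 / (sigma2 * t)           # shorter branches are more informative
        precision += w
        weighted_sum += w * x
    post_var = 1.0 / precision
    post_mean = weighted_sum * post_var
    return post_mean, post_var


def ancestral_profile(tip_profiles, branch_lengths, **kwargs):
    """Apply the per-condition posterior to whole expression profiles.

    tip_profiles holds one list per gene, each with one value per condition.
    Conditions are assumed independent here, which is a simplification.
    """
    per_condition = zip(*tip_profiles)   # regroup the values by condition
    return [ancestral_posterior(values, branch_lengths, **kwargs)
            for values in per_condition]


if __name__ == "__main__":
    # Two duplicates, three conditions; the second gene sits on a longer branch.
    profiles = [[1.0, 0.2, -0.5],
                [3.0, 0.4, 1.5]]
    for mean, var in ancestral_profile(profiles, [1.0, 3.0]):
        print(f"posterior mean = {mean:.3f}, posterior variance = {var:.3f}")
```

Under these assumptions the posterior mean is a precision-weighted average of the observed values, so it lies closer to the gene on the shorter branch, and the posterior variance gives the kind of precision measure that the abstract describes as a way to separate signal from background noise.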

[1]  Steven A. Benner,et al.  Reconstructing the evolutionary history of the artiodactyl ribonuclease superfamily , 1995, Nature.

[2]  J. Felsenstein Evolutionary trees from DNA sequences: A maximum likelihood approach , 2005, Journal of Molecular Evolution.

[3]  M. Pagel The Maximum Likelihood Approach to Reconstructing Ancestral Character States of Discrete Characters on Phylogenies , 1999 .

[4]  W. Fitch,et al.  Construction of phylogenetic trees. , 1967, Science.

[5]  B. Birren,et al.  Sequencing and comparison of yeast species to identify genes and regulatory elements , 2003, Nature.

[6]  Nick V. Grishin,et al.  Estimation of the number of amino acid substitutions per site when the substitution rate varies among sites , 1995, Journal of Molecular Evolution.

[7]  M. Nei,et al.  A new method of inference of ancestral nucleotide and amino acid sequences. , 1995, Genetics.

[8]  X. Gu,et al.  Induced gene expression in human brain after the split from chimpanzee. , 2003, Trends in genetics : TIG.

[9]  W. Maddison Squared-Change Parsimony Reconstructions of Ancestral States for Continuous-Valued Characters on a Phylogenetic Tree , 1991 .

[10]  P. Brown,et al.  Exploring the metabolic and genetic control of gene expression on a genomic scale. , 1997, Science.

[11]  Michael B. Eisen,et al.  Identification of regulatory elements using a feature selection method , 2002, Bioinform..

[12]  Jun S. Liu,et al.  An algorithm for finding protein–DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments , 2002, Nature Biotechnology.

[13]  Brian W. Matthews,et al.  Ancestral lysozymes reconstructed, neutrality tested, and thermostability linked to hydrocarbon packing , 1990, Nature.

[14]  M. O. Dayhoff,et al.  Atlas of protein sequence and structure , 1965 .

[15]  M. Nei,et al.  Molecular Evolution and Phylogenetics , 2000 .

[16]  K. Kidd,et al.  Phylogenetic analysis: concepts and methods. , 1971, American journal of human genetics.

[17]  Nicola J. Rinaldi,et al.  Transcriptional Regulatory Networks in Saccharomyces cerevisiae , 2002, Science.

[18]  Matthew W. Hahn,et al.  The evolution of transcriptional regulation in eukaryotes. , 2003, Molecular biology and evolution.

[19]  A. Wagner,et al.  Decoupled evolution of coding region and mRNA expression patterns after gene duplication: implications for the neutralist-selectionist debate. , 2000, Proceedings of the National Academy of Sciences of the United States of America.

[20]  D. Penny The comparative method in evolutionary biology , 1992 .

[21]  M. Blanchette,et al.  Discovery of regulatory elements by a computational method for phylogenetic footprinting. , 2002, Genome research.

[22]  J. Felsenstein CONFIDENCE LIMITS ON PHYLOGENIES: AN APPROACH USING THE BOOTSTRAP , 1985, Evolution; international journal of organic evolution.

[23]  G. Church,et al.  Identifying regulatory networks by combinatorial analysis of promoter elements , 2001, Nature Genetics.

[24]  Michael Q. Zhang,et al.  Large-scale human promoter mapping using CpG islands , 2000, Nature Genetics.

[25]  N. Saitou,et al.  The neighbor-joining method: a new method for reconstructing phylogenetic trees. , 1987, Molecular biology and evolution.

[26]  D. Botstein,et al.  The transcriptional program of sporulation in budding yeast. , 1998, Science.

[27]  L. Cavalli-Sforza,et al.  PHYLOGENETIC ANALYSIS: MODELS AND ESTIMATION PROCEDURES , 1967, Evolution; international journal of organic evolution.

[28]  Balázs Papp,et al.  Evolution of cis-regulatory elements in duplicated genes of yeast. , 2003, Trends in genetics : TIG.

[29]  R. Huey,et al.  PHYLOGENETIC STUDIES OF COADAPTATION: PREFERRED TEMPERATURES VERSUS OPTIMAL PERFORMANCE TEMPERATURES OF LIZARDS , 1987, Evolution; international journal of organic evolution.

[30]  Masatoshi Nei,et al.  The number of nucleotides required to determine the branching order of three species, with special reference to the human-chimpanzee-gorilla divergence , 2005, Journal of Molecular Evolution.

[31]  Michael Ruogu Zhang,et al.  Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. , 1998, Molecular biology of the cell.

[32]  Scott A. Rifkin,et al.  Evolution of gene expression in the Drosophila melanogaster subgroup , 2003, Nature Genetics.

[33]  M V Ruvolo,et al.  Hybridization cross-reactivity within homologous gene families on glass cDNA microarrays. , 2001, BioTechniques.

[34]  Kathleen Marchal,et al.  A Gibbs sampling method to detect over-represented motifs in the upstream regions of co-expressed genes , 2001, RECOMB.

[35]  D. Botstein,et al.  Genome-wide characterization of the Zap1p zinc-responsive regulon in yeast. , 2000, Proceedings of the National Academy of Sciences of the United States of America.

[36]  Gary D. Stormo,et al.  Identifying target sites for cooperatively binding factors , 2001, Bioinform..

[37]  T. Garland,et al.  PHYLOGENETIC ANALYSES OF THE CORRELATED EVOLUTION OF CONTINUOUS CHARACTERS: A SIMULATION STUDY , 1991, Evolution; international journal of organic evolution.

[38]  W. Fitch Toward Defining the Course of Evolution: Minimum Change for a Specific Tree Topology , 1971 .

[39]  M. Gerstein,et al.  Genomic analysis of gene expression relationships in transcriptional regulatory networks. , 2003, Trends in genetics : TIG.

[40]  J. Felsenstein Phylogenies and quantitative characters , 1988 .

[41]  H. Bussemaker,et al.  Building a dictionary for genomes: identification of presumptive regulatory sites by statistical analysis. , 2000, Proceedings of the National Academy of Sciences of the United States of America.

[42]  Andrey Rzhetsky,et al.  Statistical properties of the ordinary least-squares, generalized least-squares, and minimum-evolution methods of phylogenetic inference , 1992, Journal of Molecular Evolution.

[43]  J. Collado-Vides,et al.  Discovering regulatory elements in non-coding sequences by analysis of spaced dyads. , 2000, Nucleic acids research.

[44]  E. Wingender,et al.  A compilation of composite regulatory elements affecting gene transcription in vertebrates. , 1995, Nucleic acids research.

[45]  J. Felsenstein Phylogenies and the Comparative Method , 1985, The American Naturalist.

[46]  Dolph Schluter,et al.  Uncertainty in ancient phylogenies , 1995, Nature.

[47]  G. Church,et al.  Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation , 1998, Nature Biotechnology.

[48]  T. F. Hansen,et al.  Phylogenies and the Comparative Method: A General Approach to Incorporating Phylogenetic Information into the Analysis of Interspecific Data , 1997, The American Naturalist.

[49]  G. Church,et al.  A computational analysis of whole-genome expression data reveals chromosomal domains of gene expression , 2000, Nature Genetics.

[50]  Rachel B. Brem,et al.  Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors , 2003, Nature Genetics.

[51]  J. Hartigan MINIMUM MUTATION FITS TO A GIVEN TREE , 1973 .

[52]  J. Liu,et al.  Phylogenetic footprinting of transcription factor binding sites in proteobacterial genomes. , 2001, Nucleic acids research.

[53]  A. Force,et al.  Preservation of duplicate genes by complementary, degenerative mutations. , 1999, Genetics.

[54]  D. Nicolae,et al.  Rapid divergence in expression between duplicate genes inferred from microarray data. , 2002, Trends in genetics : TIG.

[55]  Olivier Gascuel,et al.  Concerning the NJ algorithm and its unweighted version, UNJ , 1996, Mathematical Hierarchies and Biology.

[56]  D. Botstein,et al.  Genomic expression programs in the response of yeast cells to environmental changes. , 2000, Molecular biology of the cell.

[57]  W. Fitch,et al.  Evidence suggesting a non-random character to nucleotide replacements in naturally occurring mutations. , 1967, Journal of molecular biology.

[58]  Partha S. Vasisht Computational Analysis of Microarray Data , 2003 .

[59]  G. Stormo,et al.  ANN-Spec: a method for discovering transcription factor binding sites with improved specificity. , 1999, Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing.

[60]  Xun Gu,et al.  Novel PAX6 binding sites in the human genome and the role of repetitive elements in the evolution of gene regulation. , 2002, Genome research.

[61]  X. Gu Statistical Framework for Phylogenomic Analysis of Gene Family Expression Profiles , 2004, Genetics.

[62]  John J. Wyrick,et al.  Genome-wide location and function of DNA binding proteins. , 2000, Science.

[63]  Alan M. Moses,et al.  Position specific variation in the rate of evolution in transcription factor binding sites , 2003, BMC Evolutionary Biology.

[64]  M. Nei,et al.  Theoretical foundation of the minimum-evolution method of phylogenetic inference. , 1993, Molecular biology and evolution.

[65]  Scott R. Presnell,et al.  The ribonuclease from an extinct bovid ruminant , 1990, FEBS letters.

[66]  P. Waddell,et al.  Rapid Evaluation of Least-Squares and Minimum-Evolution Criteria on Phylogenetic Trees , 1998 .

[67]  S. Pääbo,et al.  Intra- and Interspecific Variation in Primate Gene Expression Patterns , 2002, Science.

[68]  T. F. Hansen,et al.  TRANSLATING BETWEEN MICROEVOLUTIONARY PROCESS AND MACROEVOLUTIONARY PATTERNS: THE CORRELATION STRUCTURE OF INTERSPECIFIC DATA , 1996, Evolution; international journal of organic evolution.

[69]  Andreas Wagner,et al.  Genes regulated cooperatively by one or more transcription factors and their identification in whole eukaryotic genomes , 1999, Bioinform..

[70]  Xun Gu,et al.  Age distribution of human gene families shows significant roles of both large- and small-scale duplications in vertebrate evolution , 2002, Nature Genetics.

[71]  Kathleen Marchal,et al.  A higher-order background model improves the detection of promoter regulatory elements by Gibbs sampling , 2001, Bioinform..

[72]  A. Rodrigo,et al.  Estimating the Ancestral States of a Continuous-Valued Character Using Squared-Change Parsimony: An Analytical Solution , 1994 .