Mean-field methods in evolutionary duplication-innovation-loss models for the genome-level repertoire of protein domains.

We present a combined mean-field and simulation approach to different models describing the dynamics of classes formed by elements that can appear, disappear, or copy themselves. These models, related to a paradigmatic duplication-innovation model known as the Chinese restaurant process, are devised to reproduce the scaling behavior observed in the genome-wide repertoire of protein domains of all known species. In view of these data, we discuss the qualitative and quantitative differences between the alternative model formulations, focusing in particular on the roles of element loss and of the specificity of empirical domain classes.
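
For orientation, the duplication-innovation dynamics can be illustrated with a minimal simulation of the standard two-parameter Chinese restaurant process. This is a sketch under stated assumptions, not the specific model variants analyzed in the paper: the function name `simulate_crp`, the parameter names `alpha` and `theta`, and their default values are illustrative choices, and element loss is not included. At each step a new class is created (innovation) with probability (theta + alpha F) / (n + theta), where F is the current number of classes and n the current number of elements; otherwise an element is added to an existing class i (duplication) with probability proportional to its size minus alpha.

```python
import random


def simulate_crp(n_steps, alpha=0.5, theta=1.0, seed=0):
    """Simulate the standard two-parameter Chinese restaurant process.

    Assumes 0 <= alpha < 1 and theta > -alpha (illustrative parameters).
    Returns the list of class sizes after n_steps additions.
    """
    rng = random.Random(seed)
    class_sizes = [1]  # start from a single element in a single class
    n = 1              # total number of elements
    for _ in range(n_steps):
        n_classes = len(class_sizes)
        # Innovation: a new class appears with one element.
        p_new = (theta + alpha * n_classes) / (n + theta)
        if rng.random() < p_new:
            class_sizes.append(1)
        else:
            # Duplication: class i grows with weight (size_i - alpha).
            weights = [size - alpha for size in class_sizes]
            i = rng.choices(range(n_classes), weights=weights)[0]
            class_sizes[i] += 1
        n += 1
    return class_sizes


if __name__ == "__main__":
    sizes = simulate_crp(10_000)
    print("number of classes:", len(sizes))
    print("largest class size:", max(sizes))
```

Running the sketch for increasing `n_steps` and recording `len(sizes)` gives the growth of the number of classes with the total number of elements, which is the kind of scaling relation the abstract refers to. A loss step would have to be added separately to explore the role of element loss.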
