A minimal estimate for the gene content of the last universal common ancestor: exobiology from a terrestrial perspective.

Using an algorithm for ancestral state inference of gene content, together with a large number of extant genome sequences and a phylogenetic tree, we aim to reconstruct the gene content of the last universal common ancestor (LUCA), a hypothetical life form that was presumably the progenitor of the three domains of life. The method allows for gene loss, previously found to be a major factor in shaping gene content. As a result, the estimate of LUCA's gene content is substantially higher than those proposed previously, typically more than 1000 gene families, of which more than 90% are functionally characterized. More precisely, the number varies between 1006 and 1189 gene families when only prokaryotes are considered and increases to between 1344 and 1529 families when eukaryotes are also included, depending on the underlying phylogenetic tree. Therefore, the common belief that the hypothetical genome of LUCA should resemble the smallest extant genomes, those of obligate parasites, is not supported by recent advances in computational genomics. Instead, a fairly complex genome similar to those of free-living prokaryotes emerges as the most likely progenitor of life on Earth. Its functional capabilities, shared between the three domains of life, include metabolic transformation, information processing, membrane/transport proteins and complex regulation. This result has profound repercussions for planetary exploration and exobiology.
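To illustrate why allowing for gene loss raises the ancestral estimate, the following is a minimal sketch of weighted parsimony on presence/absence data for a single gene family. It is not the algorithm used in the study. The function name `present_at_root`, the toy tree, the family names and the gain and loss costs are all assumptions chosen for illustration.

```python
from math import inf

def present_at_root(tree, leaves_with_gene, gain_cost=2.0, loss_cost=1.0):
    """Infer whether a gene family was present at the root of a rooted tree.

    tree: nested tuples of leaf names, e.g. (("A", "B"), ("C", "D")).
    leaves_with_gene: set of leaf names whose genome contains the family.
    The costs are hypothetical; ties are resolved as 'absent'.
    """
    def cost(node):
        # Returns (min cost if node lacks the gene, min cost if it has it).
        if isinstance(node, str):
            return (inf, 0.0) if node in leaves_with_gene else (0.0, inf)
        absent_total, present_total = 0.0, 0.0
        for child in node:
            c_absent, c_present = cost(child)
            # Parent lacks the gene: child also lacks it, or gains it.
            absent_total += min(c_absent, c_present + gain_cost)
            # Parent has the gene: child keeps it, or loses it.
            present_total += min(c_present, c_absent + loss_cost)
        return absent_total, present_total

    root_absent, root_present = cost(tree)
    return root_present < root_absent

# Toy tree: two bacteria, two archaea, two eukaryotes (hypothetical names).
tree = ((("b1", "b2"), ("a1", "a2")), ("e1", "e2"))
families = {
    "universal": {"b1", "b2", "a1", "a2", "e1", "e2"},
    "patchy": {"b1", "a1", "e1"},   # one genome per domain
    "lineage_specific": {"b1"},     # a single genome
}
ancestral = {name for name, leaves in families.items()
             if present_at_root(tree, leaves)}
print(sorted(ancestral))
```

In this toy setting the patchily distributed family is assigned to the root, because three independent losses cost less than three independent gains. A method that ignored loss would exclude it and so produce a smaller ancestral gene set.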

[1]  Carl Sagan,et al.  Endogenous production, exogenous delivery and impact-shock synthesis of organic molecules: an inventory for the origins of life , 1992, Nature.

[2]  L. Margulis Archaeal-eubacterial mergers in the origin of Eukarya: phylogenetic classification of life. , 1996, Proceedings of the National Academy of Sciences of the United States of America.

[3]  R. Baragiola,et al.  Oxygen on Ganymede: laboratory studies. , 1997, Science.

[4]  B. Snel,et al.  Genomes in flux: the evolution of archaeal and proteobacterial gene content. , 2002, Genome research.

[5]  B. Berks,et al.  TatD Is a Cytoplasmic Protein with DNase Activity , 2000, The Journal of Biological Chemistry.

[6]  O. White,et al.  Environmental Genome Shotgun Sequencing of the Sargasso Sea , 2004, Science.

[7]  Christos A. Ouzounis,et al.  GeneTRACE - Reconstruction of Gene Content of Ancestral Species , 2003, Bioinform..

[8]  L. Rothschild,et al.  Life in extreme environments , 2001, Nature.

[9]  E. Bapteste,et al.  On the conceptual difficulties in rooting the tree of life. , 2004, Trends in microbiology.

[10]  C. Ouzounis,et al.  Parallel origins of the nucleosome core and eukaryotic transcription from Archaea , 1996, Journal of Molecular Evolution.

[11]  M. Saraste,et al.  Evolution of energetic metabolism: the respiration-early hypothesis. , 1995, Trends in biochemical sciences.

[12]  C. Duve A Research Proposal on the Origin Of Life. Closing Lecture given at the ISSOL Congress in Oaxaca, Mexico, on July 4, 2002 , 2003, Origins of life and evolution of the biosphere.

[13]  C. Ouzounis,et al.  Transcription in archaea. , 1999, Proceedings of the National Academy of Sciences of the United States of America.

[14]  N. Sleep,et al.  The habitat and nature of early life , 2001, Nature.

[15]  G. Horneck Exobiology, the study of the origin, evolution and distribution of life within the context of cosmic evolution: a review. , 1995, Planetary and space science.

[16]  C. Chyba,et al.  Possible ecosystems and the search for life on Europa. , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[17]  T V Johnson,et al.  Organics and other molecules in the surfaces of Callisto and Ganymede. , 1997, Science.

[18]  Anton J. Enright,et al.  COmplete GENome Tracking (COGENT): A Flexible Data Environment for Computational Genomics , 2003, Bioinform..

[19]  B. Dujon,et al.  The genomic tree as revealed from whole proteome comparisons. , 1999, Genome research.

[20]  A. Kornberg Inorganic polyphosphate: a molecule of many functions. , 2003, Annual review of biochemistry.

[21]  Michael Y. Galperin,et al.  Algorithms for computing parsimonious evolutionary scenarios for genome evolution, the last universal common ancestor and dominance of horizontal gene transfer in the evolution of prokaryotes , 2003, BMC Evolutionary Biology.

[22]  Nikos Kyrpides,et al.  Universal Protein Families and the Functional Content of the Last Universal Common Ancestor , 1999, Journal of Molecular Evolution.

[23]  R. Cavicchioli Extremophiles and the search for extraterrestrial life. , 2002, Astrobiology.

[24]  B. Snel,et al.  Genome phylogeny based on gene content , 1999, Nature Genetics.

[25]  H. Cleaves,et al.  Extremophiles may be irrelevant to the origin of life. , 2004, Astrobiology.

[26]  W. Doolittle,et al.  Tempo, mode, the progenote, and the universal root. , 1994, Proceedings of the National Academy of Sciences of the United States of America.

[27]  S. Fitz-Gibbon,et al.  Whole genome-based phylogenetic analysis of free-living microorganisms. , 1999, Nucleic acids research.

[28]  N. Glansdorff,et al.  About the last common ancestor, the universal life‐tree and lateral gene transfer: a reappraisal , 2000, Molecular microbiology.

[29]  N. Pace,et al.  The genetic core of the universal ancestor. , 2003, Genome research.

[30]  A. Joachimiak,et al.  The structure of the yrdC gene product from Escherichia coli reveals a new fold and suggests a role in RNA binding , 2000, Protein science : a publication of the Protein Society.

[31]  O. Sand,et al.  Phenotypic characterization of overexpression or deletion of the Escherichia coli crcA, cspE and crcB genes. , 2003, Microbiology.

[32]  L. Irwin,et al.  Energy cycling and hypothetical organisms in Europa's ocean. , 2002, Astrobiology.

[33]  E. A. Alekseev,et al.  A Rigorous Attempt to Verify Interstellar Glycine , 2004, astro-ph/0410335.

[34]  B. R. Tufts,et al.  Evidence for a subsurface ocean on Europa , 1998, Nature.

[35]  Michael J. Stanhope,et al.  Universal trees based on large combined protein sequence data sets , 2001, Nature Genetics.

[36]  Anton J. Enright,et al.  An efficient algorithm for large-scale detection of protein families. , 2002, Nucleic acids research.

[37]  Lazcano,et al.  The molecular search for the last common ancestor , 1999, Journal of molecular evolution.

[38]  C. Ouzounis,et al.  The balance of driving forces during genome evolution in prokaryotes. , 2003, Genome research.

[39]  E. Koonin,et al.  The rhomboids: a nearly ubiquitous family of intramembrane serine proteases that probably evolved by multiple ancient horizontal gene transfers , 2003, Genome Biology.

[40]  Doolittle Wf Phylogenetic Classification and the Universal Tree , 1999 .

[41]  H. Philippe,et al.  Ancient phylogenetic relationships. , 2002, Theoretical population biology.

[42]  Xinfu Jiao,et al.  The scavenger mRNA decapping enzyme DcpS is a member of the HIT family of pyrophosphatases , 2002, The EMBO journal.

[43]  C. Chyba,et al.  Cometary delivery of organic molecules to the early Earth. , 1990, Science.

[44]  L. Snyder THE SEARCH FOR INTERSTELLAR GLYCINE , 1997, Origins of life and evolution of the biosphere.

[45]  Eugene V. Koonin,et al.  Comparative genomics, minimal gene-sets and the last universal common ancestor , 2003, Nature Reviews Microbiology.

[46]  P. Forterre,et al.  The last universal common ancestor (LUCA), simple or complex? , 1999, The Biological bulletin.

[47]  R. Overbeek,et al.  The use of gene clusters to infer functional coupling. , 1999, Proceedings of the National Academy of Sciences of the United States of America.

[48]  Eugene V Koonin,et al.  Comparative genomics and evolution of proteins involved in RNA metabolism. , 2002, Nucleic acids research.

[49]  D. D. Marais,et al.  Astrobiology: exploring the origins, evolution, and distribution of life in the Universe. , 1999, Annual review of ecology and systematics.

[50]  A. Lazcano,et al.  The origin of life—did it occur at high temperatures? , 2004, Journal of Molecular Evolution.

[51]  A. Yayanos,et al.  Microbiology to 10,500 meters in the deep sea. , 1995, Annual review of microbiology.

[52]  D. Lovley,et al.  Extending the Upper Temperature Limit for Life , 2003, Science.

[53]  Anton J. Enright,et al.  Transcription-associated protein families are primarily taxon-specific , 2001, Bioinform..

[54]  G. Erauso,et al.  Hyperthermophilic life at deep-sea hydrothermal vents. , 1995, Planetary and space science.

[55]  Frederick W. Alt,et al.  DNA Repair, Genome Stability, and Aging , 2005, Cell.

[56]  N. Kyrpides,et al.  Tetratrico-peptide-repeat proteins in the archaeon Methanococcus jannaschii. , 1998, Trends in biochemical sciences.

[57]  C. Woese The universal ancestor. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[58]  Scott A. Sandford,et al.  Racemic amino acids from the ultraviolet photolysis of interstellar ice analogues , 2002, Nature.

[59]  C Ouzounis,et al.  The emergence of major cellular processes in evolution , 1996, FEBS letters.

[60]  James R. Brown,et al.  DNA Repair Systems in Archaea: Mementos from the Last Universal Common Ancestor? , 1999, Journal of Molecular Evolution.

[61]  P. Forterre,et al.  The nature of the last universal ancestor and the root of the tree of life, still open questions. , 1992, Bio Systems.

[62]  Anton J. Enright,et al.  Protein families and TRIBES in genome sequence space. , 2003, Nucleic acids research.

[63]  J. Castresana,et al.  Comparative genomics and bioenergetics. , 2001, Biochimica et biophysica acta.

[64]  S. Salzberg,et al.  Genome sequence of the radioresistant bacterium Deinococcus radiodurans R1. , 1999, Science.

[65]  M. Kamekura Diversity of extremely halophilic bacteria , 1998, Extremophiles.

[66]  A. Showman,et al.  The Galilean satellites. , 1999, Science.

[67]  D Penny,et al.  The nature of the last universal common ancestor. , 1999, Current opinion in genetics & development.

[68]  Christos A. Ouzounis,et al.  Measuring genome conservation across taxa: divided strains and united kingdoms , 2005, Nucleic acids research.

[69]  M. Ragan,et al.  Inferring Genome Trees by Using a Filter To Eliminate Phylogenetically Discordant Sequences and a Distance Matrix Based on Mean Normalized BLASTP Scores , 2002, Journal of bacteriology.

[70]  Cynthia B. Phillips,et al.  Europa as an Abode of Life , 2004, Origins of life and evolution of the biosphere.

[71]  D. Söll,et al.  A Single Amidotransferase Forms Asparaginyl-tRNA and Glutaminyl-tRNA in Chlamydia trachomatis * , 2001, The Journal of Biological Chemistry.

[72]  N. Kyrpides,et al.  Universally conserved translation initiation factors. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[73]  T. Gold,et al.  The deep, hot biosphere. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[74]  Sophia Tsoka,et al.  The phylogenetic extent of metabolic enzymes and pathways. , 2003, Genome research.

[75]  Thomas L. Madden,et al.  Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. , 1997, Nucleic acids research.

[76]  C. Woese On the evolution of cells , 2002, Proceedings of the National Academy of Sciences of the United States of America.