From Galactic archeology to soil metagenomics - surfing on massive data streams.

Soil microbiologists make their discoveries in the dirt and rarely look at the stars. The words of Leonardo da Vinci, ‘We know more about the movement of celestial bodies than about the soil underfoot’, are as applicable now as they were in 1510. Scientists who are deciphering soil microbe genomes and exploring the metagenomes of soil ecosystems may nevertheless learn from their sky-gazing colleagues. Keeping their feet in the mud but their heads in the sky may help them to avoid the (meta)genome-analysis gridlock.

Metagenomics involves sampling and sequencing the genomes of a community of organisms that inhabit a common environment, such as the ocean, the soil or the human gut (Handelsman, 2004; Hugenholtz & Tyson, 2008). Metagenomics provides an unbiased picture of the community structure (species richness and distribution) and its functional potential. It is rapidly moving from being a descriptive tool to an experimental tool, as comparisons are now being made of metagenomes subjected to environmental perturbations. Soils are home to microbial communities whose aggregate membership (5 × 10³⁰) (Whitman et al., 1998) outnumbers the stars shining in the sky (7 × 10²²). Complex soil habitats contain an estimated 10 to 10 species in a single gram (Gans et al., 2005; Curtis & Sloan, 2006). Once the genomes of these hundreds of thousands of species that crawl on weathered rocks, on decaying organic matter and in the rhizosphere are catalogued, the streams of data arising from soil will equal those pouring from star-gazing telescopes.

However, a major problem has taken the microbiology community off guard: how to analyze this exponentially increasing amount of sequence data. As Kahvejian et al. (2008) put it, ‘This surge of new data can be received as a flood, overwhelming the unsuspecting researcher, or as a tremendous wave that can be surfed to new horizons.’ Soil (meta)genomics sprang from fast advances in sequencing technology, and continued improvements are providing data in quantities unimaginable a few years ago. In the coming years, mining a day’s worth of data will take more than a day of large supercomputer time, and the fraction of the available data that we will be able to mine and analyze in a useful period of time will rapidly dwindle towards zero. Overcoming this bottleneck requires a shift in our frame of mind from mining databases to mining data streams.

Recent papers (Singh et al., 2009) have highlighted the needs, requirements and challenges for sequencing microbial genomes and soil metagenomes in line with the Human Microbiome Project (http://nihroadmap.nih.gov/hmp/) and the Global Ocean Sampling Expedition (Rusch et al., 2007). Before discussing how other scientific communities are navigating through the hazards of massive data streams, we will summarize the current status of genomics and metagenomics of soil micro-organisms.

[1]  M. Zimmermann.  Xylem Structure and the Ascent of Sap , 1983, Springer Series in Wood Science.

[2]  John Quackenbush,et al.  What would you do if you could sequence everything? , 2008, Nature Biotechnology.

[3]  Susan M. Huse,et al.  Microbial diversity in the deep sea and the underexplored “rare biosphere” , 2006, Proceedings of the National Academy of Sciences.

[4]  T. May,et al.  Ectomycorrhizal lifestyle in fungi: global diversity, distribution, and evolution of phylogenetic lineages , 2010, Mycorrhiza.

[5]  Helmut Hillebrand,et al.  On the Generality of the Latitudinal Diversity Gradient , 2004, The American Naturalist.

[6]  T. Bruns,et al.  Host Specificity in Ectomycorrhizal Communities: What Do the Exceptions Tell Us? , 2002, Integrative and comparative biology.

[7]  Mark J. Bailey,et al.  TerraGenome: a consortium for the sequencing of a soil metagenome , 2009, Nature Reviews Microbiology.

[8]  Eduardo Serrano,et al.  LSST: From Science Drivers to Reference Design and Anticipated Data Products , 2008, The Astrophysical Journal.

[9]  Brian J Enquist,et al.  Ecological and evolutionary determinants of a key plant functional trait: wood density and its community-wide variation across latitude and elevation. , 2007, American journal of botany.

[10]  Mark Westoby,et al.  Land-plant ecology on the basis of functional traits. , 2006, Trends in ecology & evolution.

[11]  A. Fremier,et al.  Are true multihost fungi the exception or the rule? Dominant ectomycorrhizal fungi on Pinus sabiniana differ from those on co-occurring Quercus species. , 2009, The New phytologist.

[12]  W. Whitman,et al.  Prokaryotes: the unseen majority. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[13]  Nick Kaiser,et al.  Pan-STARRS: a wide-field optical survey telescope array , 2004, SPIE Astronomical Telescopes + Instrumentation.

[14]  E. Lilleskov,et al.  Can we develop general predictive models of mycorrhizal fungal community-environment relationships? , 2007, The New phytologist.

[15]  S. Kravitz,et al.  CAMERA: A Community Resource for Metagenomics , 2007, PLoS biology.

[16]  N. Kyrpides.  Fifteen years of microbial genomics: meeting the challenges and fulfilling the dream , 2009, Nature Biotechnology.

[17]  L. Tedersoo,et al.  Low diversity and high host preference of ectomycorrhizal fungi in Western Amazonia, a neotropical biodiversity hotspot , 2010, The ISME Journal.

[18]  L. Poorter,et al.  The importance of wood traits and hydraulic conductance for the performance and life history strategies of 42 rainforest tree species. , 2010, The New phytologist.

[19]  C. Bledsoe,et al.  Influence of host species on ectomycorrhizal communities associated with two co-occurring oaks (Quercus spp.) in a tropical cloud forest. , 2009, FEMS microbiology ecology.

[20]  Fnal,et al.  The Field of Streams: Sagittarius and its Siblings , 2006, astro-ph/0605025.

[21]  K. Abazajian,et al.  THE SEVENTH DATA RELEASE OF THE SLOAN DIGITAL SKY SURVEY , 2008, 0812.0649.

[22]  K. Jones,et al.  Massively parallel 454 sequencing indicates hyperdiverse fungal communities in temperate Quercus macrocarpa phyllosphere. , 2009, The New phytologist.

[23]  P. Reich,et al.  A handbook of protocols for standardised and easy measurement of plant functional traits worldwide , 2003 .

[24]  Paul M Kirk,et al.  Fungal ecology catches fire. , 2009, The New phytologist.

[25]  F. Martin,et al.  Pyrosequencing reveals a contrasted bacterial diversity between oak rhizosphere and surrounding soil. , 2010, Environmental microbiology reports.

[26]  J. Tiedje,et al.  Advantages of the metagenomic approach for soil exploration: reply from Vogel et al. , 2009, Nature Reviews Microbiology.

[27]  James M. Lund,et al.  A USER'S GUIDE TO THE PALOMAR SKY SURVEY , 1973 .

[28]  K. Nara,et al.  Host effects on ectomycorrhizal fungal communities: insight from eight host species in mixed conifer-broadleaf forests. , 2007, The New phytologist.

[29]  Susan M. Huse,et al.  Exploring Microbial Diversity and Taxonomy Using SSU rRNA Hypervariable Tag Sequencing , 2008, PLoS genetics.

[30]  Frans Bongers,et al.  Leaf traits are good predictors of plant performance across 53 rain forest species. , 2006, Ecology.

[31]  S. Davies,et al.  Potential Link between Plant and Fungal Distributions in a Dipterocarp Rainforest: Community and Phylogenetic Structure of Tropical Ectomycorrhizal Fungi across a Plant and Soil Ecotone , 2009 .

[32]  H. Schenk,et al.  Wood anatomy and wood density in shrubs: Responses to varying aridity along transcontinental transects. , 2009, American journal of botany.

[33]  K. Esler,et al.  Xylem density, biomechanics and anatomical traits correlate with water stress in 17 evergreen shrub species of the Mediterranean‐type climate region of South Africa , 2007 .

[34]  Rick L. Stevens,et al.  Functional metagenomic profiling of nine biomes , 2008, Nature.

[35]  M. Zobel,et al.  Large-scale parallel 454 sequencing reveals host ecological group specificity of arbuscular mycorrhizal fungi in a boreonemoral forest. , 2009, The New phytologist.

[36]  J. Chave,et al.  Towards a Worldwide Wood Economics Spectrum 2. Leading Dimensions in Wood Function , 2022 .

[37]  W. Cornwell,et al.  Wood density and vessel traits as distinct correlates of ecological strategy in 51 California coast range angiosperms. , 2006, The New phytologist.

[38]  F. Meinzer.  Functional convergence in plant responses to the environment , 2002, Oecologia.

[39]  A. Halpern,et al.  The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through Eastern Tropical Pacific , 2007, PLoS biology.

[40]  M. Westoby,et al.  The relationship between stem biomechanics and wood density is modified by rainfall in 32 Australian woody plant species. , 2010, The New phytologist.

[41]  L. Forney,et al.  The tragedy of the uncommon: understanding limitations in the analysis of microbial diversity , 2008, The ISME Journal.

[42]  Frederick C Meinzer,et al.  Safety and efficiency conflicts in hydraulic architecture: scaling from tissues to trees. , 2008, Plant, cell & environment.

[43]  K. Nara,et al.  Underground primary succession of ectomycorrhizal fungi in a volcanic desert on Mount Fuji. , 2003, The New phytologist.

[44]  P. Sassone-Corsi,et al.  Computational Improvements Reveal Great Bacterial Diversity and High Metal Toxicity in Soil , 2022 .

[45]  Alexander F. Auch,et al.  MEGAN analysis of metagenomic data. , 2007, Genome research.

[46]  J. Handelsman.  Metagenomics: Application of Genomics to Uncultured Microorganisms , 2004, Microbiology and Molecular Biology Reviews.

[47]  S. Tringe,et al.  Comparative Metagenomics of Microbial Communities , 2004, Science.

[48]  Andreas Wilke,et al.  phylogenetic and functional analysis of metagenomes , 2022 .

[49]  Sergei L. Kosakovsky Pond,et al.  Windshield splatter analysis with the Galaxy metagenomic pipeline. , 2009, Genome research.

[50]  L. Poorter,et al.  Wood mechanics, allometry, and life-history variation in a tropical rain forest tree community. , 2006, The New phytologist.

[51]  M. Garbelotto,et al.  A strong species-area relationship for eukaryotic soil microbes: island size matters for ectomycorrhizal fungi. , 2007, Ecology letters.

[52]  S. Trumbore,et al.  Spatial separation of litter decomposition and mycorrhizal nitrogen uptake in a boreal forest. , 2007, The New phytologist.

[53]  Daniel S Falster,et al.  Angiosperm wood structure: Global patterns in vessel anatomy and their relation to wood density and potential conductivity. , 2010, American journal of botany.

[54]  L. Tedersoo,et al.  Diversity and community structure of ectomycorrhizal fungi in a wooded meadow. , 2006, Mycological research.

[55]  Inna Dubchak,et al.  The integrated microbial genomes (IMG) system , 2005, Nucleic Acids Res..

[56]  G. Casella,et al.  Pyrosequencing enumerates and contrasts soil microbial diversity , 2007, The ISME Journal.

[57]  W. Sloan,et al.  Exploring Microbial Diversity--A Vast Below , 2005, Science.

[58]  F. Martin,et al.  454 Pyrosequencing analyses of forest soils reveal an unexpectedly high fungal diversity. , 2009, The New phytologist.

[59]  L. Tedersoo,et al.  Strong host preference of ectomycorrhizal fungi in a Tasmanian wet sclerophyll forest as revealed by DNA barcoding and taxon-specific primers. , 2008, The New phytologist.

[60]  C. Bledsoe,et al.  Contrasting ectomycorrhizal fungal communities on the roots of co-occurring oaks (Quercus spp.) in a California woodland. , 2008, The New phytologist.