Quantifying environmental adaptation of metabolic pathways in metagenomics

Recently, approaches have been developed to sample the genetic content of heterogeneous environments (metagenomics). However, how these sequences link distinct environmental conditions with specific biological processes is not well understood. A major challenge is therefore to determine how the usage of particular pathways and subnetworks reflects the adaptation of microbial communities across environments and habitats, i.e., how network dynamics relate to environmental features. Previous research has treated environments as discrete, somewhat simplified classes (e.g., terrestrial vs. marine) and searched for obvious metabolic differences among them (i.e., treating the analysis as a typical classification problem). However, environmental differences result from combinations of many factors, which often vary only slightly. We therefore introduce an approach that employs correlation and regression to relate multiple, continuously varying factors defining an environment to the extent of particular microbial pathways present at a geographic site. Moreover, rather than looking only at individual (one-to-one) correlations, we adapted canonical correlation analysis and related techniques to define an ensemble of weighted pathways that maximally covaries with a combination of environmental variables (many-to-many), which we term a metabolic footprint. Applying this approach to available aquatic datasets, we identified footprints that are predictive of their environment and could potentially be used as biosensors. For example, we show a strong multivariate correlation between the energy-conversion strategies of a community and multiple environmental gradients (e.g., temperature). We also identified covariation in amino acid transport and cofactor synthesis, suggesting that limiting amounts of cofactor can (partially) explain increased import of amino acids in nutrient-limited conditions.
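The contrast between one-to-one correlation and the many-to-many analysis can be illustrated with a minimal sketch. It is not the authors' implementation: it estimates only the first canonical pair, uses a simple alternating least-squares iteration in place of a full canonical correlation solver, and runs on synthetic data. The two environmental variables, the three pathway abundances, the shared latent gradient, and all function names are hypothetical and introduced purely for illustration.

```python
import math
import random

def center(M):
    """Subtract each column's mean (rows = sites, columns = variables)."""
    n = len(M)
    means = [sum(col) / n for col in zip(*M)]
    return [[x - m for x, m in zip(row, means)] for row in M]

def matvec(M, w):
    """Weighted sum of the columns of M, one score per site."""
    return [sum(x * c for x, c in zip(row, w)) for row in M]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    aug = [list(A[i]) + [b[i]] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, k):
            f = aug[r][c] / aug[c][c]
            for j in range(c, k + 1):
                aug[r][j] -= f * aug[c][j]
    x = [0.0] * k
    for i in reversed(range(k)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, k))
        x[i] = (aug[i][k] - s) / aug[i][i]
    return x

def regress(M, target):
    """Least-squares weights w minimising |M w - target| (normal equations)."""
    cols = list(zip(*M))
    gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(a * t for a, t in zip(ci, target)) for ci in cols]
    return solve(gram, rhs)

def unit(w, scores):
    """Rescale weights so that the resulting scores have unit length."""
    norm = math.sqrt(sum(s * s for s in scores))
    return [x / norm for x in w], [s / norm for s in scores]

def first_canonical_pair(X, Y, iters=200):
    """Alternately regress each block on the other block's current scores."""
    X, Y = center(X), center(Y)
    b = [1.0] + [0.0] * (len(Y[0]) - 1)
    b, v = unit(b, matvec(Y, b))
    for _ in range(iters):
        a = regress(X, v)
        a, u = unit(a, matvec(X, a))
        b = regress(Y, u)
        b, v = unit(b, matvec(Y, b))
    # u and v are centered with unit length, so their dot product is a correlation.
    rho = sum(p * q for p, q in zip(u, v))
    return rho, a, b

def pearson(x, y):
    """Ordinary one-to-one correlation between two variables."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Synthetic data (assumption): 40 sites sharing one hidden environmental gradient.
rng = random.Random(0)
latent = [rng.gauss(0, 1) for _ in range(40)]
env = [[z + 0.3 * rng.gauss(0, 1), rng.gauss(0, 1)] for z in latent]
pathways = [[z + 0.3 * rng.gauss(0, 1), rng.gauss(0, 1),
             -z + 0.3 * rng.gauss(0, 1)] for z in latent]

rho, env_weights, footprint = first_canonical_pair(env, pathways)
best_single = max(abs(pearson(e, p))
                  for e in zip(*env) for p in zip(*pathways))

print("first canonical correlation:", round(rho, 3))
print("best one-to-one correlation:", round(best_single, 3))
print("pathway weights (footprint):", [round(w, 3) for w in footprint])
```

In this sketch the pathway weight vector plays the role of the footprint: a weighted ensemble of pathways whose combined score covaries with a weighted combination of environmental variables. On the same samples, the first canonical correlation cannot be smaller than the best single pairwise correlation, because any single pair is one admissible choice of weights.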
