A guide to statistical analysis in microbial ecology: a community-focused, living review of multivariate data analyses.

Multivariate statistical analyses have become a standard feature of microbial ecology. However, many microbial ecologists are still developing a deep understanding of these methods and an appreciation of their limitations. As a consequence, staying abreast of progress and debate in this area poses an additional challenge to many in the field. To address these issues, we present the GUide to STatistical Analysis in Microbial Ecology (GUSTA ME): a dynamic, web-based resource providing accessible descriptions of numerous multivariate techniques relevant to microbial ecologists. A combination of interactive elements allows users to discover and navigate between methods relevant to their needs and to examine how others in the field have used them. We have designed GUSTA ME to become a community-led and community-curated service, which we hope will provide a common reference and a forum for discussing and disseminating analytical techniques relevant to the microbial ecology community.
