Global Diversity and Biogeography of Bacterial Communities in Wastewater Treatment Plants

Linwei Wu†, Daliang Ning†, Bing Zhang†, Yong Li, Ping Zhang, Xiaoyu Shan, Qiuting Zhang, Mathew Brown, Zhenxin Li, Joy D. Van Nostrand, Fangqiong Ling, Naijia Xiao, Ya Zhang, Julia Vierheilig, George F. Wells, Yunfeng Yang, Ye Deng, Qichao Tu, Aijie Wang, Global Water Microbiome Consortium‡, Tong Zhang, Zhili He, Jurg Keller, Per H. Nielsen, Pedro J. J. Alvarez, Craig S. Criddle, Michael Wagner, James M. Tiedje, Qiang He*, Thomas P. Curtis*, David A. Stahl, Lisa Alvarez-Cohen, Bruce E. Rittmann, Xianghua Wen*, and Jizhong Zhou*

State Key Joint Laboratory of Environment Simulation and Pollution Control, School of Environment, Tsinghua University, Beijing, China; Institute for Environmental Genomics, Department of Microbiology and Plant Biology, and School of Civil Engineering and Environmental Sciences, University of Oklahoma, Norman, OK, USA; Consolidated Core Laboratory, University of Oklahoma, Norman, Oklahoma, USA; College of Resource & Environment, Southwest University, Chongqing, China; Alkek Center for Metagenomics and Microbiome Research, Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, USA; School of Engineering, Newcastle University, Newcastle upon Tyne, UK; School of Environment, Northeastern Normal University, Changchun, China; Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, MO, USA; Department of Microbiology and Ecosystem Science, Division of Microbial Ecology, Research Network 'Chemistry meets Microbiology', University of Vienna, Vienna, Austria; Karl Landsteiner University of Health Sciences, Division of Water Quality and Health, Krems, Austria & Interuniversity Cooperation Centre for Water and Health; Department of Civil and Environmental Engineering, Northwestern University, Evanston, IL, USA; Institute for Marine Science and Technology, Shandong University, Qingdao, China; Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, China; Environmental Biotechnology Laboratory, The University of Hong Kong, Hong Kong, China; Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Sun Yat-Sen University, Guangzhou, China; Guangdong Provincial Key Laboratory of Environmental Pollution Control and Remediation Technology, Guangzhou, China; Advanced Water Management Centre, The University of Queensland, Brisbane, QLD, Australia; Department of Chemistry and Bioscience, Center for Microbial Communities, Aalborg University, Aalborg, Denmark; Department of Civil and Environmental Engineering, Rice University, Houston, TX, USA; Department of Civil and Environmental Engineering, Stanford University, Stanford, CA, USA; Center for Microbial Ecology, Michigan State University, East Lansing, MI, USA; Department of Civil and Environmental Engineering, The University of Tennessee, Knoxville, TN, USA; Institute for a Secure and Sustainable Environment, The University of Tennessee, Knoxville, TN, USA; Department of Civil and Environmental Engineering, University of Washington, Seattle, WA, USA; Department of Civil and Environmental Engineering, College of Engineering, University of California, Berkeley, CA, USA; Earth and Environmental Sciences, Lawrence Berkeley National Laboratory, Berkeley, CA, USA; Biodesign Swette Center for Environmental Biotechnology, Arizona State University, Tempe, AZ, USA.

This document is the accepted manuscript version of the following article:
