Wastewater-Based Epidemiology to Describe the Evolution of SARS-CoV-2 in the South-East of Spain, and Application of Phylogenetic Analysis and a Machine Learning Approach

The COVID-19 pandemic has posed a significant global threat, leading to several initiatives for its control and management. One such initiative is wastewater-based epidemiology, which has gained attention for its potential to provide early warning of virus outbreaks and real-time information on the spread of the virus. In this study, wastewater samples from two wastewater treatment plants (WWTPs) located in the southeast of Spain (region of Murcia), namely Murcia and Cartagena, were analyzed using RT-qPCR and high-throughput sequencing techniques to describe the evolution of SARS-CoV-2 in this area. Additionally, phylogenetic analysis and machine learning approaches were applied to develop a pre-screening tool for identifying differences in the variant composition of different wastewater samples. The results confirmed that the levels of SARS-CoV-2 in these wastewater samples changed in line with the number of SARS-CoV-2 cases detected in the population, and that variant occurrences were consistent with clinically reported data. The sequence analyses helped to describe how the different SARS-CoV-2 variants replaced one another over time. The phylogenetic analysis showed that samples collected at close sampling times were more similar to each other than samples collected further apart in time. A second analysis, using a machine learning approach based on the mutations found in the SARS-CoV-2 spike protein, was also conducted, with hierarchical clustering (HC) used as an efficient unsupervised approach for data analysis. Results indicated that samples obtained in October 2022 in Murcia and Cartagena were significantly different, which corresponded well with the different virus variants circulating in the two locations. The methods proposed in this study are adequate for comparing consensus sequence types of SARS-CoV-2 sequences as a preliminary evaluation of potential changes in the variants circulating in a given population at a specific time point.
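
The hierarchical clustering step described above can be illustrated with a minimal sketch. The abstract does not state the distance measure, the linkage criterion, or the software used, so everything here is an assumption made for illustration: each sample is reduced to a set of spike mutations, samples are compared with the Jaccard distance, and clusters are merged by average linkage until the closest pair is further apart than a chosen threshold. The sample names, the mutation profiles, and the function names are hypothetical and are not data or code from the study.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |A & B| / |A | B| for two sets (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage_clusters(profiles, threshold):
    """Agglomerative clustering of samples by their mutation sets.

    Starts with one cluster per sample and repeatedly merges the two
    clusters with the smallest average pairwise Jaccard distance,
    stopping once that distance exceeds `threshold`.
    """
    clusters = [[name] for name in sorted(profiles)]

    def cluster_distance(c1, c2):
        pairs = [(x, y) for x in c1 for y in c2]
        total = sum(jaccard_distance(profiles[x], profiles[y]) for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        (i, j), best = min(
            (
                ((i, j), cluster_distance(clusters[i], clusters[j]))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical spike-mutation profiles (illustrative only, not study data).
profiles = {
    "sample_A": {"D614G", "L452R", "T478K", "P681R"},
    "sample_B": {"D614G", "L452R", "T478K", "P681R", "T19R"},
    "sample_C": {"D614G", "G339D", "N501Y", "Q498R", "T478K"},
    "sample_D": {"D614G", "G339D", "N501Y", "Q498R", "S371F"},
}

if __name__ == "__main__":
    for cluster in average_linkage_clusters(profiles, threshold=0.5):
        print(cluster)
```

In this toy input, samples that share most of their spike mutations end up in the same cluster, which is the kind of pre-screening signal the abstract describes. The threshold of 0.5 is arbitrary and would need tuning on real data.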
