Genomic Epidemiology of SARS-CoV-2 in Guangdong Province, China

Highlights:
1) 1.6 million molecular diagnostic tests identified 1,388 SARS-CoV-2 infections in Guangdong Province, China, by 19th March 2020.
2) Virus genomes can be recovered using a variety of sequencing approaches from a range of patient samples.
3) Genomic analyses reveal multiple virus importations into Guangdong Province, resulting in genetically distinct clusters that require careful interpretation.
4) Large-scale epidemiological surveillance and intervention measures were effective in interrupting community transmission in Guangdong.

Summary: COVID-19 is caused by the SARS-CoV-2 coronavirus and was first reported in central China in December 2019. Extensive molecular surveillance in Guangdong, China's most populous province, during early 2020 resulted in 1,388 reported RNA-positive cases from 1.6 million tests. To understand the molecular epidemiology and genetic diversity of SARS-CoV-2 in China, we generated 53 genomes from infected individuals in Guangdong using a combination of metagenomic sequencing and tiling amplicon approaches. Combined epidemiological and phylogenetic analyses indicate multiple independent introductions to Guangdong, although phylogenetic clustering is uncertain due to low virus genetic variation early in the pandemic. Our results illustrate how the timing, size and duration of putative local transmission chains were constrained by national travel restrictions and by the province's large-scale intensive surveillance and intervention measures. Despite these successes, COVID-19 surveillance in Guangdong is still required as the number of cases imported from other countries is increasing.
