Decoding the evolution and transmissions of the novel pneumonia coronavirus (SARS-CoV-2 / HCoV-19) using whole genomic data

The outbreak of COVID-19 started in mid-December 2019 in Wuhan, China. By 29 February 2020, SARS-CoV-2 (HCoV-19 / 2019-nCoV) had infected more than 85 000 people worldwide. In this study, we used 93 complete genomes of SARS-CoV-2 from the GISAID EpiFlu™ database to investigate the evolution and human-to-human transmission of SARS-CoV-2 during the first two months of the outbreak. We constructed haplotypes of the SARS-CoV-2 genomes, performed phylogenomic analyses and estimated potential changes in the population size of the virus. The date of population expansion was calculated from the expansion parameter tau (τ) using the formula t = τ/(2u). A total of 120 substitution sites in 119 codons, comprising 79 non-synonymous and 40 synonymous substitutions, were found in eight coding regions of the SARS-CoV-2 genomes. Forty of the non-synonymous substitutions are potentially associated with virus adaptation. No combinations were detected. A total of 58 haplotypes (31 found in samples from China and 31 in samples from outside China) were identified among the 93 viral genomes and could be classified into five groups. Using the reported bat coronavirus genome (bat-RaTG13-CoV) as the outgroup, we found that haplotypes H13 and H38 might be considered ancestral haplotypes, and that H1 was later derived from the intermediate haplotype H3. The population of SARS-CoV-2 was estimated to have undergone a recent expansion on 06 January 2020 and an early expansion on 08 December 2019. Furthermore, phyloepidemiologic approaches recovered specific directions of human-to-human transmission and the potential sources of international cases.
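As an illustration of the dating step, the relation t = τ/(2u) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure used in the study: it assumes that u is the substitution rate for the whole genome per year (a per-site rate multiplied by the genome length), that t is therefore in years, and that the expansion date is obtained by counting back from a chosen reference date. The function names and all numeric inputs below are hypothetical placeholders.

```python
from datetime import date, timedelta


def expansion_time_years(tau: float, rate_per_site_per_year: float,
                         genome_length: int) -> float:
    """Time since expansion in years, from t = tau / (2u).

    Assumption: u is the per-genome substitution rate per year,
    i.e. the per-site rate multiplied by the genome length.
    """
    u = rate_per_site_per_year * genome_length
    if u <= 0:
        raise ValueError("the substitution rate must be positive")
    return tau / (2.0 * u)


def expansion_date(tau: float, rate_per_site_per_year: float,
                   genome_length: int, reference_date: date) -> date:
    """Count back t years (as whole days) from the reference date."""
    years = expansion_time_years(tau, rate_per_site_per_year, genome_length)
    return reference_date - timedelta(days=round(years * 365.25))


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the arithmetic.
    tau = 1.5                  # expansion parameter (placeholder value)
    rate = 1.0e-3              # substitutions per site per year (placeholder)
    length = 29_900            # approximate genome length in nucleotides
    ref = date(2020, 2, 29)    # placeholder reference date
    print(expansion_date(tau, rate, length, ref))
```

Because t scales inversely with u, the inferred date is sensitive to the assumed substitution rate; doubling the rate halves the estimated time since expansion.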
