Automated quality control for a molecular surveillance system

Background: Molecular surveillance and outbreak investigation are important for elimination of hepatitis C virus (HCV) infection in the United States. A web-based system, Global Hepatitis Outbreak and Surveillance Technology (GHOST), has been developed using Illumina MiSeq-based amplicon sequence data derived from the HCV E1/E2-junction genomic region to enable public health institutions to conduct cost-effective and accurate molecular surveillance, outbreak detection and strain characterization. However, many factors can degrade input data quality, and GHOST is not completely immune to them, so the accuracy of the epidemiological inferences it generates may be affected. Here, we analyze the data submitted to the GHOST system during its pilot phase to assess the nature of the data and to identify common quality concerns that can be detected and corrected automatically.

Results: The GHOST quality control filters were individually examined, and quality failure rates were measured for all samples, including negative controls. New filters were developed and introduced to detect primer dimers, loss of specimen-specific product, or short products. The genotyping tool was adjusted to improve the accuracy of subtype calls. The identification of “chordless” cycles in a transmission network from data generated with known laboratory-based quality concerns allowed for further improvement of transmission detection by GHOST in surveillance settings. Parameters derived to detect common, actionable quality control anomalies were incorporated into the automatic quality control module, which rejects data depending on the magnitude of a quality problem and warns and guides users in taking corrective actions. The guiding responses generated by the system are tailored to the GHOST laboratory protocol.

Conclusions: Several new quality control problems were identified in MiSeq data submitted to GHOST and used to improve protection of the system from erroneous data and of users from erroneous inferences. The GHOST system was upgraded to identify causes of erroneous data and to recommend corrective actions to laboratory users.
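
The reject-or-warn behavior described in the Results can be illustrated with a minimal sketch. It is not the GHOST implementation: the function name `triage`, the `QCResult` type, the two thresholds, and the message wording are all hypothetical, since the abstract gives no parameter values. The sketch assumes a single anomaly measure, the fraction of reads flagged as short products or primer dimers.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration only; the source gives no values.
WARN_FRACTION = 0.10
REJECT_FRACTION = 0.50


@dataclass
class QCResult:
    status: str   # "pass", "warn", or "reject"
    message: str  # guidance shown to the laboratory user


def triage(flagged_reads: int, total_reads: int) -> QCResult:
    """Classify a sample by the fraction of reads flagged as anomalous."""
    if total_reads <= 0:
        # Nothing to analyze, so the sample cannot be accepted.
        return QCResult("reject", "No reads found; check library preparation.")
    fraction = flagged_reads / total_reads
    if fraction >= REJECT_FRACTION:
        return QCResult("reject", f"{fraction:.0%} of reads flagged; repeat amplification.")
    if fraction >= WARN_FRACTION:
        return QCResult("warn", f"{fraction:.0%} of reads flagged; review before interpreting results.")
    return QCResult("pass", "No action needed.")


if __name__ == "__main__":
    for flagged, total in [(0, 100), (20, 100), (60, 100)]:
        print(flagged, total, triage(flagged, total))
```

Using two thresholds rather than one mirrors the graded response in the text: a small anomaly produces guidance, while a large one blocks the data from analysis.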
