Genomic analyses of African Trypanozoon strains to assess evolutionary relationships and identify markers for strain identification

African trypanosomes of the sub-genus Trypanozoon are eukaryotic parasites that cause disease in humans or livestock. Genomic resources can be of great use to those studying and controlling the spread of these trypanosomes. Here we present a large comparative analysis of Trypanozoon whole genomes, 83 in total, including human and animal infective African trypanosomes: 21 T. brucei brucei, 22 T. b. gambiense, 35 T. b. rhodesiense and 4 T. evansi strains, of which 21 were from Uganda. We constructed a maximum likelihood phylogeny based on 162,210 single nucleotide polymorphisms (SNPs). The three Trypanosoma brucei sub-species and Trypanosoma evansi are not monophyletic, confirming earlier studies that indicated high similarity among Trypanosoma “sub-species”. We also applied discriminant analysis of principal components (DAPC) to the same set of SNPs, identifying seven genetic clusters. These clusters do not correspond well with existing taxonomic classifications, in agreement with the phylogenetic analysis. Geographic origin is reflected in both the phylogeny and the clustering analysis. Finally, we used sparse linear discriminant analysis to rank SNPs by their informativeness in differentiating the strains in our data set. As few as 84 SNPs can completely distinguish the strains used in our study, and discriminant analysis was still able to detect genetic structure using as few as 10 SNPs. Our results reinforce earlier findings of high genetic similarity among the African Trypanozoon. Despite this similarity, a small subset of SNPs can serve as genetic markers for strain identification or other epidemiological investigations.
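The idea that a small SNP panel can completely distinguish a set of strains can be illustrated with a minimal sketch. This is not the sparse linear discriminant analysis used in the study; it is a simple greedy heuristic swapped in for illustration, which repeatedly picks the SNP that separates the most strain pairs not yet told apart. The function name `greedy_distinguishing_snps`, the strain labels, and the toy genotype matrix (alleles coded 0/1/2) are all hypothetical.

```python
from itertools import combinations


def greedy_distinguishing_snps(genotypes):
    """Greedily pick SNP indices until every pair of strains differs at one.

    genotypes: dict mapping strain name -> list of allele codes, all lists
    of equal length (one entry per SNP). Hypothetical input format.
    Returns (chosen SNP indices, strain pairs that no SNP can separate).
    """
    strains = sorted(genotypes)
    n_snps = len(genotypes[strains[0]])
    unresolved = set(combinations(strains, 2))  # pairs not yet told apart
    chosen = []

    while unresolved:
        best, best_split = None, set()
        for j in range(n_snps):
            if j in chosen:
                continue
            # Pairs this SNP would separate among those still unresolved.
            split = {(a, b) for a, b in unresolved
                     if genotypes[a][j] != genotypes[b][j]}
            if len(split) > len(best_split):
                best, best_split = j, split
        if best is None:
            break  # remaining pairs are identical at every SNP
        chosen.append(best)
        unresolved -= best_split

    return chosen, unresolved


# Toy data (made up for illustration): 4 strains genotyped at 5 SNPs.
toy = {
    "strain_A": [0, 0, 1, 0, 2],
    "strain_B": [0, 1, 1, 0, 2],
    "strain_C": [0, 0, 1, 2, 0],
    "strain_D": [0, 1, 1, 2, 0],
}

chosen, unresolved = greedy_distinguishing_snps(toy)
print("SNP panel:", chosen)
print("Indistinguishable pairs:", sorted(unresolved))
```

A greedy panel like this is not guaranteed to be the smallest possible one, and unlike a discriminant analysis it does not rank SNPs by how well they separate predefined groups; it only shows why a handful of well-chosen sites can resolve many strains.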
