A Network Approach to Analyzing Highly Recombinant Malaria Parasite Genes

The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs), and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα) domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences.
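The network construction described above can be illustrated with a minimal sketch: each sequence is a node, and two nodes are linked when they share an exact substring of at least some minimum length. The function names, the fixed `min_len` threshold, and the toy sequences are assumptions for illustration only. The abstract does not specify how a "significant" match length is chosen, and the HVR identification and stochastic block model steps are not shown here.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_block_network(sequences, min_len):
    """Link two sequences if they share an exact substring of length >= min_len.

    Sharing any exact match of length >= min_len is equivalent to sharing
    at least one substring of length exactly min_len, so comparing k-mer
    sets is sufficient.
    """
    kmer_sets = {name: kmers(seq, min_len) for name, seq in sequences.items()}
    edges = set()
    for a, b in combinations(sorted(sequences), 2):
        if kmer_sets[a] & kmer_sets[b]:
            edges.add((a, b))
    return edges


# Toy sequences, made up for illustration (not real DBLα data).
toy = {
    "s1": "ACDEFGHIK",
    "s2": "WWDEFGHWW",
    "s3": "MNPQRSTVW",
    "s4": "YYPQRSTYY",
}

# min_len is a hypothetical fixed threshold standing in for "significant length".
edges = shared_block_network(toy, min_len=5)
print(sorted(edges))
```

In this toy example, s1 and s2 share the block `DEFGH` and s3 and s4 share `PQRST`, so the network splits into two disconnected pairs. This is the kind of community structure that constrained recombination would be expected to produce, in contrast to the uniformly random structure expected under free recombination.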
