A novel strategy for classifying the output from an in silico vaccine discovery pipeline for eukaryotic pathogens using machine learning algorithms

Background
An in silico vaccine discovery pipeline for eukaryotic pathogens typically consists of several computational tools that predict protein characteristics. The aim of the in silico approach to discovering subunit vaccines is to use these predicted characteristics to identify proteins that are worthy of laboratory investigation. A major challenge is that the predictions inherently contain hidden inaccuracies and contradictions. This study focuses on how to reduce the number of false candidates using machine learning algorithms rather than relying on expensive laboratory validation. Proteins from Toxoplasma gondii, Plasmodium sp., and Caenorhabditis elegans were used as training and test datasets.

Results
The results show that, for proteins observed experimentally to induce immune responses, machine learning algorithms can effectively distinguish expected true from expected false vaccine candidates, with an average sensitivity of 0.97 and an average specificity of 0.98.

Conclusions
Vaccine candidates from an in silico approach can only be truly validated in a laboratory. Given any in silico output and appropriate training data, the number of false candidates allocated for validation can be dramatically reduced using a pool of machine learning algorithms. This will ultimately save time and money in the laboratory.
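To make the two reported metrics concrete, the following is a minimal sketch of how a pool of classifiers might vote on candidates and how sensitivity and specificity are computed from the resulting labels. It is not the authors' implementation: the three threshold rules, the two-value feature tuples, the labels, and the names `majority_vote` and `sensitivity_specificity` are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# A classifier maps a protein's predicted characteristics to 1 (candidate) or 0.
Classifier = Callable[[Sequence[float]], int]


def majority_vote(classifiers: Sequence[Classifier], features: Sequence[float]) -> int:
    """Return 1 if more than half of the pooled classifiers vote 1, else 0."""
    votes = sum(clf(features) for clf in classifiers)
    return 1 if votes * 2 > len(classifiers) else 0


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical pool: simple threshold rules standing in for trained models.
pool: List[Classifier] = [
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[1] > 0.5),
    lambda x: int(x[0] + x[1] > 1.0),
]

# Invented toy data: two predicted scores per protein, with expected labels.
proteins = [(0.9, 0.8), (0.7, 0.2), (0.1, 0.3), (0.6, 0.9), (0.2, 0.1), (0.6, 0.45)]
labels = [1, 0, 0, 1, 0, 0]

predictions = [majority_vote(pool, p) for p in proteins]
sens, spec = sensitivity_specificity(labels, predictions)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy data the last protein is a false positive, so specificity falls below 1 while sensitivity stays at 1. A real pipeline would replace the threshold rules with classifiers trained on labelled proteins.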
